ABSTRACT

We present an analysis of the relative bias between early- and late-type galaxies in the
Two-degree Field Galaxy Redshift Survey (2dFGRS), as defined by the $\eta$ parameter of
Madgwick et al. (2002), which quantifies the spectral type of galaxies in the survey. Our
analysis examines the joint counts in cells between early- and late-type galaxies, using
approximately cubical cells with sides ranging from $7\,h^{-1}\,$Mpc to $42\,h^{-1}\,$Mpc. We
measure the variance of the counts in cells using the method of Efstathiou et al. (1990),
which we find requires a correction for a finite volume effect equivalent to the integral
constraint bias of the autocorrelation function. Using a maximum likelihood technique we fit
lognormal models to the one-point density distribution, and develop methods of dealing with
biases in the recovered variances resulting from this technique. We use a modified
$\chi^2$ technique to determine to what extent the relative bias is consistent with a simple
linear bias relation; this analysis results in a significant detection of
nonlinearity/stochasticity even on large scales. We directly fit deterministic models for the
joint density distribution function, $f(\delta_E, \delta_L)$, to the joint counts in cells
using a maximum likelihood technique. Our results are consistent with a scale invariant
relative bias factor on all scales studied. Linear bias is ruled out on scales less than
$\ell = 28\,h^{-1}\,$Mpc. A power-law bias model is a significantly better fit to the data on
all but the largest scales studied; the relative goodness of fit of this model as compared to
that of the linear bias model suggests that any nonlinearity is negligible for
$\ell > 40\,h^{-1}\,$Mpc, consistent with the expectation from theory that the bias should
become linear on large scales.

Key words: galaxies: statistics, distances and redshifts - large-scale structure of the Universe 
- surveys 
1 INTRODUCTION

Measurements of large-scale structure from galaxy redshift surveys obviously measure the
distribution of luminous matter only; the total mass distribution will be dominated by dark
matter. The question of how the galaxies trace the total matter density field is therefore
extremely pertinent, both to the estimation of cosmological parameters, and also as a probe
of the physics of galaxy formation. A common assumption is 'linear biasing', which can be
expressed as $\delta_g = b\,\delta_m$, where $\delta_g$, $\delta_m$ are the fractional
overdensities relative to the mean in galaxies and mass respectively. This assumption becomes
unphysical when $b > 1$ since, by definition, $\delta_g \geq -1$, but we can still define a
bias parameter, $b(r)$, by e.g. $\xi_{gg}(r) = b(r)^2\,\xi_{mm}(r)$. Many of the constraints
on cosmological parameters derived from large-scale structure measurements rely on an
understanding of galaxy bias. Both the 2dFGRS power spectrum analysis (Percival et al. 2001)
and the constraints obtained for the neutrino mass (Elgarøy & Lahav 2003) assume scale
independent bias. Joint constraints obtained by combining the 2dFGRS results with
measurements of the CMB power spectrum (Percival et al. 2002; Efstathiou et al. 2002; Verde
et al. 2003) also require a model for galaxy bias. Dekel & Lahav (1999) show that
nonlinearity and stochasticity in the bias relation can explain discrepancies between
different methods of measuring parameters which assume a linear bias factor, such as
measurements of $\beta = \Omega_m^{0.6}/b$ (Peacock et al. 2001; Hawkins et al. 2003).

In fact both theoretical approaches (Mo & White 1996) and 
simulations predict that bias may be non-linear and scale depen- 
dent, at least on some (small) scales. Kauffmann et al. (1997) find 
only weak scale dependence on large scales and a bias relation con- 
sistent with linear bias. Benson et al. (2000) find that semi-analytic 
galaxies in a LCDM model could reproduce the APM correlation 
function given a scale dependent bias taking the form of an an- 
tibias of galaxies relative to matter on small scales. Somerville 
et al. (2001) also use semi-analytic modelling to demonstrate that 
the physics of galaxy formation introduces a small scatter in the 
galaxy-mass relation; they find the mean bias to have only a weak 
dependence on scale for $r < 12\,h^{-1}\,$Mpc (where the Hubble constant is
$H_0 = 100\,h\,{\rm km\,s^{-1}\,Mpc^{-1}}$).

In principle the true mass distribution can be directly mea- 
sured from measurements of galaxy peculiar velocities using 
e.g. POTENT reconstruction (Dekel, Bertschinger & Faber 1990). 
In practice accuracy is hard to achieve by such methods; the tech- 
nique requires heavy smoothing since the error bars per galaxy are 
large and the volumes surveyed up to the present are relatively lo- 
cal. A useful probe is instead to compare the clustering of different 
types of galaxy: if these cluster differently, at least one type cannot 
exactly follow the mass distribution. 

It has been known for some considerable time that galaxies of 
different morphological type have different clustering properties. 
Early-type galaxies, such as ellipticals or S0s, are highly clustered, accounting for almost
90% of galaxies in the cores of rich clusters. This fraction drops off steeply, however, with
distance from the cluster cores and in the field 70% of galaxies are late-type galaxies:
spirals and irregulars (Dressler 1980; Postman & Geller 1984). The level of fluctuations in
each of the early- and late-type density fields can also be compared using the correlation
functions or power spectra for the two sub-populations. This kind of study is optimised for
small separations ($< 10\,h^{-1}\,$Mpc) and has generally revealed that the clustering
amplitude of ellipticals is greater than that of spirals by a factor of 1.3-1.5 (e.g. Loveday
et al. 1995; Norberg et al. 2002a; Madgwick et al. 2003b). If both density fields were
perfectly correlated with the matter density field this factor would be equivalent to the
ratio between linear bias parameters $(b_E(r)/b_L(r))^2$. There is also evidence that the
relative bias between sub-populations of galaxies is more complex than the global galaxy
bias. Measurements of the 2dFGRS bispectrum (Verde et al. 2002) found no evidence for
nonlinearity in the bias for 2dFGRS galaxies. More recently, however, Kayo et al. (2004) find
evidence for relative bias being complex on weakly non-linear to non-linear scales from a
measurement of the redshift-space three-point correlation function, as a function of galaxy
colour and morphology, in the Sloan Digital Sky Survey. Wild et al. (2004) have carried out a
counts-in-cells analysis using volume limited samples from the 2dFGRS, and find evidence for
non-linearity and stochastic effects.

A detailed framework for dealing with possible nonlinearities and stochasticity in the bias
relation is given by Dekel & Lahav (1999), based on the joint probability distribution of the
galaxy and mass densities $f(\delta_g, \delta_m)$. In an analogous manner we will consider
the joint probability distribution of the early- and late-type galaxy 
density fields for magnitude limited samples in the 2dFGRS. This 
approach in large part follows the methods described in Blanton 
(2000) for the Las Campanas Redshift Survey (LCRS), although 
the geometry of the 2dFGRS is considerably more amenable to this 
kind of study than that of the LCRS and allows us, for example, to 
examine a large range of scales. 

This paper is organised into sections as follows. In Section 2 we summarize details of the
2dFGRS, the PCA-$\eta$ parameter and the division into cells. We present a measurement of the
variances of the counts in cells using the method of Efstathiou et al. (1990), which we have
corrected for integral constraint bias, in Section 3. In Section 4 we present an analysis of
the one-point distribution of the counts in cells based on fits to a lognormal distribution
function. In Section 5 we discuss the relative bias. We present the results of applying the
'modified $\chi^2$' statistic of Tegmark (1999) to the joint counts in cells and then move on
to describe the application of the maximum-likelihood technique of Blanton (2000) to
constrain the relative bias between spectral types. We summarize our conclusions in
Section 6.



2 THE 2dF GALAXY REDSHIFT SURVEY 

The 2dFGRS observations were carried out between May 1997 and 
April 2002 using the 2dF instrument: a multi-object spectrograph 
on the Anglo-Australian Telescope. The main survey region consists of two broad strips, one
in the South Galactic Pole region (SGP) covering approximately
$-37^\circ.5 < \delta < -22^\circ.5$, $21^{\rm h}40^{\rm m} < \alpha < 3^{\rm h}40^{\rm m}$,
and the other in the direction of the North Galactic Pole (NGP), spanning
$-7^\circ.5 < \delta < 2^\circ.5$, $9^{\rm h}50^{\rm m} < \alpha < 14^{\rm h}50^{\rm m}$.
In addition there are a number of circular two-degree fields scat- 
tered randomly over the full extent of the low extinction regions of 
the southern APM galaxy survey. 

The parent catalogue for the survey was selected in the photometric $b_J$ band from a revised
and extended version of the APM galaxy survey (Maddox, Efstathiou & Sutherland 1990a,b,c;
1996). The magnitude limit at the start of the survey was set at $b_J = 19.45$ but both the
photometry of the input catalogue and the dust extinction map have since been revised and so
there are small variations in magnitude limit as a function of position over the sky. The
effective median magnitude limit, over the area of the survey, is $b_J \approx 19.3$
(Colless et al. 2001; Colless et al. 2003). 

The completeness of the survey data varies according to the 
position on the sky because of unobserved fields (mostly around 
the survey edges), untargeted objects in observed fields (due to collision constraints or
broken fibres) and observed objects with poor spectra; also there are drill-holes around
bright stars. The variation in completeness with angular position, $\theta$, is fully
described by the completeness mask (Colless et al. 2001; Norberg et al. 2002b; Colless et al.
2003). Note that since we use exclusively those galaxies for which a principal component
spectral type has been derived, we require a slightly modified completeness mask from that
describing the completeness of the full survey, one which reflects the completeness of
galaxies with measured $\eta$-type (Norberg et al. 2002a).

We use the completed 2dFGRS data set which was released 
publicly at the end of June 2003 (Colless et al. 2003). This includes 
221 414 unique, reliable galaxy redshifts (quality flag $\geq 3$, Colless et al. 2001;
Colless et al. 2003). The random fields, which contain nearly 25 000 reliable redshifts, are
not included in this analysis. Throughout the paper we treat the NGP and SGP regions as
independent data sets, which means we have two estimates for each of
the statistics we derive. This approach functions both as a 'reality 
check' for our error estimates and also gives an idea of the variation 
due to cosmic variance. 



2.1 PCA classification of galaxy spectra

The spectral properties of 2dFGRS galaxies have been analysed 
and the galaxies split into spectral type classes using a principal 
component analysis (PCA) described by Madgwick et al. (2002). 
This technique splits the galaxies on the basis of the characteristics 
of their spectra which show the most variation across the sample, 
without using any prior assumptions or template spectra. Madgwick et al. (2002) define a
scalar parameter, $\eta$, which is a linear combination of the first two principal components
chosen to minimize instrumental effects which make the determination of the continuum
uncertain. In effect, $\eta$ quantifies the relative strengths of emission and absorption
lines, and can be shown to be tightly correlated to the equivalent width of H$\alpha$ in
particular, so a simple physical interpretation of $\eta$ is as a measure of the current star
formation rate in a galaxy (Madgwick et al. 2003a).

The PCA classification makes use of the spectral information in the rest-frame wavelength
range 3700 Å to 6650 Å, which includes all the major optical diagnostics between [O II] and
H$\alpha$. The spectral coverage imposes a limit on the maximum redshift at which this
analysis can be used of $z = 0.2$. For galaxies with $z > 0.15$, however, sky absorption
bands contaminate the H$\alpha$ line, which affects the stability of the classification. For
this reason we restrict our analysis to galaxies with $z < 0.15$ following Madgwick et al.
(2002).

The distribution of $\eta$ for the 2dFGRS spectra is clearly bimodal (see fig. 4 of Madgwick
et al. 2002), with a shoulder at $\eta = -1.4$. Madgwick et al. (2002) divide galaxies into
four spectral type bins based on the shape of this distribution; the local minimum at
$\eta = -1.4$ is used to separate early and late types while the
late type 'shoulder' is divided in two and also separated from the 
tail, which will be dominated by particularly active galaxies such as 
starbursts and AGN. Because of the effects of possible evolution in 
the last two spectral type bins, discussed further in the next section, 
we use only the spectral classes 1 & 2 of Madgwick et al. (2002) 
in this paper, which we refer to as early and late type respectively, 
and exclude the bluer classes 3 & 4. 

The important aspect of the PCA classification used here is 
that it represents a coherent method for dividing a galaxy sample 
into classes based on a diagnostic with a relatively clear physical 
interpretation, i.e. current star formation rate. Note that this specific interpretation will
not necessarily be the case for all galaxy samples classified using a PCA method; comparing
classifications based on a PCA analysis between samples is in general nontrivial. In our
case, however, the fact that the PCA classification of the 2dFGRS is dominated by H$\alpha$
means that a classification of galaxies based on $\eta$ is virtually the same as a
classification based on the equivalent width of H$\alpha$.



2.2 Counts-in-cells 

The analysis presented in this paper is complementary to ap- 
proaches to the study of the relative bias based on correlation func- 
tions, in that we compare the two density fields on a point by point 
basis, rather than measuring overall clustering amplitudes. To es- 
timate the local galaxy density contrast for each type we use the 
method of counts in cells. Analysis of the 2dFGRS counts in cells 
has already been used by Croton et al. (2004a,b) and Baugh et al. 
(2004) to constrain the higher-order correlation functions and the 
void probability function. Another respect in which this analysis 
is complementary to the correlation function approach is that it is 
optimised for much larger scales. 

The method by which we divide the survey region into cells is identical to that first used by
Efstathiou et al. (1990) for measuring the variance of the counts in cells for a
sparse-sampled redshift survey of IRAS galaxies (Rowan-Robinson et al. 1990). First the
surveyed region of space is divided into spherical shells of thickness $\ell$ centred on the
observer. Each shell is then subdivided into approximately cubical cells using lines of
constant right ascension and declination, chosen for each radial distance and declination to
ensure that the sides of the cell are of length $\approx \ell$, the shell separation. In this
paper we analyse cell divisions over a range in $\ell$ from $\ell = 7\,h^{-1}\,$Mpc to
$\ell = 42\,h^{-1}\,$Mpc. We assume a concordance cosmology of $\Omega_m = 0.3$,
$\Omega_\Lambda = 0.7$. Effective scales corresponding to a given value of $\ell$ can be
computed using the approximate relationship between the radius of a Gaussian sphere window,
$R_G$, the radius of a spherical top-hat window, $R_T$, and $\ell$, given in Eq. 1
(Peacock 1999).



$$R_G = \frac{R_T}{\sqrt{5}} = \frac{\ell}{\sqrt{12}}. \qquad (1)$$
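As a quick illustration of this window-scale conversion, the sketch below assumes the standard matching of second moments between a Gaussian window, a spherical top-hat and a cube, i.e. $R_G = R_T/\sqrt{5} = \ell/\sqrt{12}$. The function name `effective_radii` is hypothetical and not part of the analysis described here.

```python
import math


def effective_radii(ell):
    """Return (R_G, R_T) for a cubical cell of side ell (same units as ell).

    Assumes R_G = ell / sqrt(12) and R_T = sqrt(5) * R_G, the relation
    obtained by matching the second moments of the three windows.
    """
    r_gauss = ell / math.sqrt(12.0)
    r_tophat = math.sqrt(5.0) * r_gauss
    return r_gauss, r_tophat


# Tabulate the effective radii for the smallest, an intermediate and the
# largest cell size used in the text (all in h^-1 Mpc).
for ell in (7.0, 14.0, 42.0):
    r_g, r_t = effective_radii(ell)
    print(f"ell = {ell:5.1f}  R_G = {r_g:6.2f}  R_T = {r_t:6.2f}")
```

The same conversion can be used when comparing cell variances with predictions quoted for Gaussian or top-hat windows.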



Any analysis of the galaxy counts in cells for a flux limited survey must take into account
the selection function, which quantifies the probability that a galaxy with a given redshift,
$z$, is included in the survey. We define $M_{\min}(z,\theta)$ and $M_{\max}(z,\theta)$ to be
the minimum and maximum absolute magnitudes visible at redshift $z$ given the magnitude limit
of the survey which, in the case of the 2dFGRS, varies with angular position $\theta$ as
described in Section 2, and we take the luminosity function $\Phi(M)$ to be normalized in
that range. Then the selection function can be written



$$\phi(z,\theta) = \int_{M_{\min}(z,\theta)}^{M_{\max}(z,\theta)} dM\,\Phi(M)\,c_z(\theta). \qquad (2)$$



The $c_z(\theta)$ term describes the variation in completeness of the survey over the sampled
area, for which we use the survey mask for $\eta$-typed galaxies described in Section 2. The
completeness also depends slightly on apparent magnitude and, for the full survey, one can
use the relation given in Colless et al. (2001) to parameterise this variation:

$$c_z(\theta, b_J) = \gamma\{1 - \exp[b_J - \mu(\theta)]\}, \qquad (3)$$

where $\gamma$ is set at 0.99 and $\mu(\theta)$ can be set by the requirement that
Figure 1. An example of the division into cells for the NGP (top) and SGP (bottom) regions, with cells of side $\ell = 14\,h^{-1}\,$Mpc; the cells spanning the central
declination of each slice are plotted along with the galaxies of each type (early types in red & on the left, late types in blue & on the right) in the cell. The
colour scales indicate the estimated galaxy density contrast as defined in the text.



Table 1. Total number of cells and expected counts (presented as the 16%-50%-84% percentiles of the distribution).

                              NGP                                        SGP
$\ell$ ($h^{-1}\,$Mpc)  Cells  Early types     Late types       Cells  Early types     Late types
 7.0                    11056  2.0-3.2-5.7     1.2-2.2-4.7      14593  2.0-3.0-5.0     1.3-2.2-4.5
 8.75                    5689  2.8-4.5-8.3     1.7-3.0-6.7       7543  2.7-4.1-6.9     1.8-3.0-6.2
10.5                     3170  4.0-6.5-11.7    2.5-4.4-9.5       4114  3.8-5.8-9.6     2.6-4.3-8.7
12.25                    2026  5.4-8.9-16.4    3.4-6.0-13.3      2567  5.3-7.8-13.1    3.6-5.9-12.0
14.0                     1198  8.4-13.1-23.3   5.3-8.8-18.9      1484  7.7-11.4-18.7   5.5-8.8-17.4
17.5                      620  13.7-22.8-40.5  8.7-15.7-32.6      729  12.7-18.3-30.0  9.2-14.4-28.3
21.0                      336  24.5-39.3-69.4  15.7-26.8-56.0     372  22.0-32.8-52.4  15.8-25.5-49.8
24.5                      187  41.3-64.6-117   25.8-43.3-95.6     205  36.0-49.2-80.3  26.6-38.6-75.9
28.0                      113  63.1-95.9-175   39.1-63.1-136      117  51.8-77.0-114   38.1-61.0-110
31.5                       80  98.1-162-252    61.8-109-197       125  86.3-122-179    62.6-99.5-171
35.0                       57  140-226-321     85.2-152-250       101  120-160-260     85.9-125-255
38.5                       45  183-278-390     112-184-301         77  143-175-305     104-138-296
42.0                       34  248-351-476     156-237-349         53  168-227-388     125-176-369
$$\langle c_z(\theta, b_J)\rangle = c_z(\theta), \qquad (4)$$

where the average is over all apparent magnitudes, $b_J$.

Unfortunately, the $\mu(\theta)$ mask is undefined for $\eta$-typed galaxies. To generate
such a mask one would need to assume a form for the number counts as a function of $\eta$, as
well as making an assumption for the variation of completeness with limiting redshift,
$z_{\max}$, since the $\eta$-parameter is only defined for $z \leq 0.15$. Given the
large number of assumptions which would be necessary we have 
not included a correction for the effect of apparent magnitude on 
completeness in this study; the effect is in any case a small one, 
particularly in a flux limited catalogue. 
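For reference, the magnitude-dependent completeness parameterisation of Colless et al. (2001) quoted above can be written as a one-line function. This is only a sketch of that relation: the study itself does not apply this correction to the $\eta$-typed sample, and the function name `completeness` and the example magnitudes are illustrative assumptions.

```python
import math


def completeness(b_j, mu, gamma=0.99):
    """Magnitude-dependent completeness c_z = gamma * (1 - exp(b_j - mu)).

    b_j   : apparent magnitude of the galaxy
    mu    : value of the mu(theta) mask at the galaxy's position
    gamma : asymptotic completeness for bright galaxies (0.99 in the text)
    """
    return gamma * (1.0 - math.exp(b_j - mu))


# Completeness falls towards zero as b_j approaches the local value of mu.
for b_j in (17.0, 18.5, 19.3):
    print(b_j, round(completeness(b_j, mu=19.6), 3))
```

Because the exponential only matters within about a magnitude of $\mu(\theta)$, the correction is small for most galaxies, in line with the statement that the effect is minor in a flux limited catalogue.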

Once we have a knowledge of the selection function we can define the expected number of
galaxies in each cell $i$ by

$$N_{{\rm exp},i} = \int_{V_i} \phi(z,\theta)\,dV, \qquad (5)$$

where the integral is over the volume of the cell $i$.

We use expected counts in cells ($N_{E,{\rm exp},i}$ and $N_{L,{\rm exp},i}$ for early and
late types respectively), which we obtain by integrating the $\eta$-type dependent luminosity
functions of Madgwick et al. (2003) using the average magnitude limit over the surface of
each cell (Colless et al. 2001; Colless et al. 2003), corrected for $c_z(\theta)$ as
described above. We reject from the analysis cells for which the average completeness over
the cell is less than 70%. We also renormalize the expected counts to ensure that
$\langle N/N_{\rm exp}\rangle = 1$ in order to correct for possible errors in the
normalization of the luminosity functions. We find that this choice of renormalization gives
the most stable results, although the exclusion of empty cells from the renormalization step
is necessary to ensure stability (see Section 4.2). In practice the details of this
renormalization step do not significantly affect the best fitting parameters for the variance
or relative bias, but they do change the KS test probabilities for our models.
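The renormalization step can be sketched as follows. This is a minimal illustration, assuming an unweighted mean of $N/N_{\rm exp}$ over the non-empty cells; the exact weighting used in the analysis is not specified in the text, and the function name `renormalise_expected` is hypothetical.

```python
def renormalise_expected(counts, expected):
    """Rescale expected counts so that the mean of N / N_exp is 1.

    Empty cells (N == 0) are left out of the mean, as described in the text;
    the unweighted mean over cells is an assumption of this sketch.
    """
    ratios = [n / e for n, e in zip(counts, expected) if n > 0 and e > 0]
    if not ratios:
        raise ValueError("no non-empty cells to normalise against")
    scale = sum(ratios) / len(ratios)
    # Multiplying N_exp by the mean ratio brings the mean of N / N_exp to 1.
    return [e * scale for e in expected]


observed = [4, 0, 9, 6]
expected = [2.0, 1.0, 3.0, 4.0]
print(renormalise_expected(observed, expected))
```

Only a single multiplicative factor is changed, so the relative expected counts between cells, which carry the selection function, are untouched.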

This approach to dealing with the selection function for sub- 
sets of the data, such as the division into early and late-type galax- 
ies, is vulnerable to systematic errors in the selection function. An 
example of such an effect would be a surface-brightness term in 
the selection function for $\eta$-typed galaxies, which one would expect would affect early-
and late-type galaxies differently. Any systematic error in calculating the expected number
of counts in cells for the early- or late-type galaxies will bias results for the
non-linearity and stochasticity of the bias function. Such an effect was noted in Blanton
(2000), who found that his results were sensitive to excluding the low redshift part of his
sample. Madgwick et al. (2002) detect a large overabundance of spectral types 3 and 4 beyond
$z = 0.11$, relative to the predicted $n(z)$ based on the luminosity functions for these
spectral types. Although such an observation could be due to aperture effects, a more
plausible explanation is the presence of evolution for these spectral type bins. In
principle, such evolution could be modelled in our analysis, and we could derive accurate
$N_{\rm exp}$ for all late-type galaxies. If there is evolution in these spectral types,
however, we may expect that the relative bias could also be evolving over the redshifts used
in this analysis. For this reason we have used only $\eta$-types 1 & 2 in the analysis.

Fig. 1 shows the division into cells for $\ell = 14\,h^{-1}\,$Mpc. The galaxies in each cell
of the respective spectral type are shown overlaid on a colour scale indicating the estimated
galaxy density contrast in that cell. The intermediate density contrasts are much less
prevalent in the early-type density field, implying that the contrast between clusters and
voids is increased, as one would expect. Table 1 shows the number of cells in each cell
division together with the median and 16% and 84% percentiles of the distribution of expected
counts.

Wild et al. (2004) have also analysed colour selected volume 
limited samples from the 2dFGRS. Unfortunately, luminosity func- 
tions for different colour selections from the 2dFGRS have not yet 
been measured, and so we cannot define the selection functions 
required to carry out our analyses on colour selected magnitude 
limited samples. 



3 THE VARIANCE OF THE COUNTS IN CELLS 

One of the most fundamental statistics accessible from the counts 
in cells is the variance of the counts. This is directly related to the 
galaxy autocorrelation function, being equal to the volume aver- 
age of the correlation function once the contribution of discreteness 
noise is removed. In later sections we will fit a parametric model to 
the one-point distribution of the counts in cells as the first stage in 
a maximum likelihood approach to fitting for the relative bias; we 
would expect that an accurate model for the PDF will reproduce the 
variance of the counts as measured in this section. Furthermore, by 
comparing the variance between spectral types, we can obtain an 
estimate of the linear relative bias parameter. 



3.1 Predictions for the variance 

The real-space and redshift-space correlation functions for the full 
set of 2dFGRS galaxies have been accurately measured by Hawkins 
et al. (2003). We have used these results to obtain predictions for 
the variance of the counts in cells using a number of approaches 
outlined below. 

If we assume that the real-space correlation function is well described by a power-law of the
form

$$\xi(r) = \left(\frac{r}{r_0}\right)^{-\gamma},$$

then we can use the following form for the power spectrum expressed as the variance per
$\ln k$:

$$\Delta^2(k) = \frac{2}{\pi}(k r_0)^{\gamma}\,\Gamma(2-\gamma)\,\sin\frac{(2-\gamma)\pi}{2}. \qquad (6)$$

For the case of Gaussian spheres of radius $R_G$, the variance in spheres, $\sigma^2$, is
equal to the value of $\Delta^2(k)$ at

$$k = \left[\frac{1}{2}\left(\frac{\gamma-2}{2}\right)!\right]^{1/\gamma}\frac{1}{R_G}, \qquad (7)$$

but this is also a good approximation to the variance in cubical cells if we take
$R_G = \ell/\sqrt{12}$, as described in Section 2.2.
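The power-law scaling prediction can be evaluated in a few lines. The sketch below assumes $\xi(r) = (r/r_0)^{-\gamma}$ with $1 < \gamma < 2$, the power-law form of $\Delta^2(k)$, the effective wavenumber for a Gaussian window with $((\gamma-2)/2)! = \Gamma(\gamma/2)$, and $R_G = \ell/\sqrt{12}$. The values of `r0` and `gamma` in the usage lines are illustrative placeholders, not the fitted values of Hawkins et al. (2003), and the function names are hypothetical.

```python
import math


def delta2_powerlaw(k, r0, gamma):
    """Delta^2(k) for xi(r) = (r / r0)**(-gamma); intended for 1 < gamma < 2."""
    return ((2.0 / math.pi) * (k * r0) ** gamma * math.gamma(2.0 - gamma)
            * math.sin((2.0 - gamma) * math.pi / 2.0))


def k_effective(r_gauss, gamma):
    """Wavenumber at which Delta^2 equals the variance in Gaussian spheres.

    ((gamma - 2) / 2)! is evaluated as Gamma(gamma / 2).
    """
    return (0.5 * math.gamma(gamma / 2.0)) ** (1.0 / gamma) / r_gauss


def sigma2_cell(ell, r0, gamma):
    """Approximate variance in cubical cells of side ell."""
    r_gauss = ell / math.sqrt(12.0)
    return delta2_powerlaw(k_effective(r_gauss, gamma), r0, gamma)


# Illustrative placeholder parameters (lengths in h^-1 Mpc).
for ell in (7.0, 14.0, 28.0, 42.0):
    print(ell, sigma2_cell(ell, r0=5.0, gamma=1.7))
```

For a pure power law the predicted variance scales as $\ell^{-\gamma}$, which gives a simple sanity check on any implementation.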

More accurate predictions can be obtained by directly integrating the correlation function
over a cubical volume of side $\ell$ to calculate the variance in cells:

$$\sigma^2 = \frac{1}{V^2}\int_V\!\int_V dV_1\,dV_2\,\xi(r). \qquad (8)$$
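One way to carry out this double volume integral numerically is sketched below. It is not necessarily the scheme used by the authors: it relies on the fact that, for two points drawn uniformly in a cube, the separation along each axis (in units of the side) has probability density $2(1-u)$ on $[0,1]$, which reduces the six-dimensional integral to three dimensions. A simple midpoint rule is used; for steep power laws the integrable singularity at $r = 0$ limits the accuracy, so `n` should be increased until the result converges. The function name `cell_variance` is hypothetical.

```python
import math


def cell_variance(xi, ell, n=40):
    """Volume average of xi(r) over pairs of points in a cube of side ell.

    The separation along each axis, d = ell * u, has probability density
    2 * (1 - u) for u in [0, 1]; the remaining three-dimensional integral
    is evaluated with a midpoint rule on n^3 nodes.
    """
    h = 1.0 / n
    nodes = [(i + 0.5) * h for i in range(n)]
    weights = [2.0 * (1.0 - u) * h for u in nodes]
    total = 0.0
    for ux, wx in zip(nodes, weights):
        for uy, wy in zip(nodes, weights):
            for uz, wz in zip(nodes, weights):
                r = ell * math.sqrt(ux * ux + uy * uy + uz * uz)
                total += wx * wy * wz * xi(r)
    return total


# Illustrative power law with placeholder parameters r0 = 5, gamma = 1.7.
print(cell_variance(lambda r: (r / 5.0) ** -1.7, ell=14.0, n=30))
```

A tabulated correlation function can be passed in the same way by wrapping an interpolation routine in the `xi` callable.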



We have used this method to calculate predictions based on both 
the real-space and redshift-space correlation functions of Hawkins 
et al. (2003). The variance obtained from a volume average of the 
best-fitting power-law form for the real-space correlation function 
(Hawkins et al. 2003) is almost identical to that using the scaling 
relation (Eqs. 6 & 7), as one might expect. We would not expect
our measured variances to match the real-space predictions as we 
have not made any corrections for redshift-space distortions. The 
variances predicted using the redshift-space $\xi(s)$ from Hawkins et al. (2003) are
significantly higher. We have calculated redshift-space variance estimates using both a
power-law approximation for $\xi(s)$ and a direct interpolation from the data of Hawkins et
al. (2003), since $\xi(s)$ is not well approximated by a power law. The estimated variance
from the interpolated $\xi(s)$ data (solid lines in Figs 2, 4 & Q) is likely to give the most
accurate prediction for the variance of the counts in cells for the combination of spectral
types 1 & 2.



3.2 Measuring the variance from the counts in cells 

A maximum-likelihood technique for calculating the variance of counts in cells is presented
by Efstathiou et al. (1990). In each redshift shell in our cell division we can compute the
statistic

$$S = \frac{1}{M-1}\sum_i (N_i - \bar N)^2 - \bar N, \qquad (9)$$

where the sum extends over the $M$ cells in the shell, and $\bar N$ is the mean cell count;
$N_i$ are the observed counts in cell $i$. Note that this technique is based only on the
measured counts in cells and does not use the expected counts, $N_{\rm exp}$, calculated in
Section 2.2. The expectation value for $S$ is

$$\langle S\rangle = \bar n^2 V^2 \sigma^2 = \bar N^2 \sigma^2, \qquad (10)$$

where $\bar n$ is the mean number density and $V$ is the volume of the cells. The variance of
$S$ for the case where the underlying density fluctuations are Gaussian is given by

$${\rm Var}(S) = \frac{2\bar n^2 V^2(1 + \sigma^2) + 4\bar n^3 V^3 \sigma^2 + 2\bar n^4 V^4 \sigma^4}{M}. \qquad (11)$$

Clearly this variance will be underestimated since we have 
made two key assumptions which are not strictly correct; namely 
that the underlying fluctuations are Gaussian and that the cells 
are independent. The effect of correlations between cells was ad- 
dressed by Broadhurst, Taylor & Peacock (1995). They show that 
for all but adjacent cells the covariance in the cell counts will be 
negligible and, within the accuracy to which $\sigma^2$ can be calculated, the error in
treating even adjacent cells as independent is unimportant. A far more serious concern is
with the assumption of Gaussian fluctuations; even on the relatively large scales of this
analysis this assumption is far from valid. By using the variance estimator of Eq. 9 on
Monte-Carlo realizations of lognormal fields we find that the variance of $S$ is many times
larger than expected from Eq. 11.
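The shell statistic and its Gaussian-case variance are straightforward to compute. The sketch below assumes the unbiased $1/(M-1)$ normalization of the sample variance with the Poisson term $\bar N$ subtracted, and the Gaussian form of ${\rm Var}(S)$ written in terms of $\bar N = \bar n V$; both function names are hypothetical.

```python
def efstathiou_s(counts):
    """S = sum_i (N_i - Nbar)^2 / (M - 1) - Nbar for the cells of one shell."""
    m = len(counts)
    if m < 2:
        raise ValueError("need at least two cells in a shell")
    mean = sum(counts) / m
    scatter = sum((n - mean) ** 2 for n in counts) / (m - 1)
    return scatter - mean  # subtract the Poisson (shot-noise) contribution


def gaussian_var_s(mean_count, sigma2, m):
    """Variance of S for Gaussian fluctuations and independent cells.

    mean_count is Nbar = nbar * V; sigma2 is the cell variance; m is the
    number of cells in the shell.
    """
    nv = mean_count
    return (2.0 * nv ** 2 * (1.0 + sigma2)
            + 4.0 * nv ** 3 * sigma2
            + 2.0 * nv ** 4 * sigma2 ** 2) / m


counts = [2, 4, 6, 8]
s = efstathiou_s(counts)
nbar = sum(counts) / len(counts)
print("S =", s, " sigma^2 estimate =", s / nbar ** 2)
```

Dividing $S$ by $\bar N^2$ gives a per-shell estimate of $\sigma^2$, which is what the likelihood analysis combines across shells.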

This method must be modified to deal with completeness variations in the 2dFGRS, as
quantified by the survey mask. Efstathiou et al. (1990) give the following modified estimator
for $S$ for the case where cells are incomplete due to the survey mask:

$$S = A/B,$$

where

$$A = \sum_i \left(N_i - \frac{n_1 V_i}{v_1}\right)^2 - n_1\left(1 - \frac{v_2}{v_1^2}\right), \qquad B = v_2 - \frac{2 v_3}{v_1} + \frac{v_2^2}{v_1^2}, \qquad (12)$$



where for convenience we have used the quantities $n_1$, $v_1$, $v_2$, $v_3$, defined as:

$$n_1 = \sum_i N_i, \quad v_1 = \sum_i V_i, \quad v_2 = \sum_i V_i^2, \quad v_3 = \sum_i V_i^3, \qquad (13)$$



where $V_i$ is the usable volume of cell $i$. This correction is only valid if only a small
fraction of a cell is excluded by the mask, so for this test we reject all cells where the
fraction of the cell of low completeness (< 70%) is greater than 30%. We also upweight the
counts in cells to compensate for incompleteness, based on the survey mask.

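For each shell, $j$, we define a likelihood,

$$L_j = \frac{1}{\sqrt{2\pi\,{\rm Var}(S_j)}}\exp\left[-\frac{(S_j - \bar n_j^2 V^2 \sigma^2)^2}{2\,{\rm Var}(S_j)}\right], \qquad (14)$$

which is calculated using Eq. 12 as an estimator for $\bar n^2 V^2 \sigma^2$. We then
minimise with respect to $\sigma$ the quantity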



(15) 



where the sum is over all shells. Note that this differs by a factor 2
compared to $\mathcal{L}$ defined by Wild et al. (2004).
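The combination of shells can be sketched as a one-dimensional minimization. The example below assumes a Gaussian likelihood for each $S_j$ with mean $\bar N_j^2\sigma^2$ and the Gaussian-case ${\rm Var}(S_j)$, sums $-\ln L_j$ over shells, and uses a simple grid search; an overall constant factor in the definition of the minimized quantity does not move the minimum. The data layout and function names are hypothetical.

```python
import math


def neg_log_likelihood(sigma2, shells):
    """Sum of -ln L_j over shells.

    shells is an iterable of (S_j, Nbar_j, M_j): the measured statistic, the
    mean cell count and the number of cells in shell j.
    """
    total = 0.0
    for s_j, nbar_j, m_j in shells:
        var = (2.0 * nbar_j ** 2 * (1.0 + sigma2)
               + 4.0 * nbar_j ** 3 * sigma2
               + 2.0 * nbar_j ** 4 * sigma2 ** 2) / m_j
        resid = s_j - nbar_j ** 2 * sigma2
        total += 0.5 * math.log(2.0 * math.pi * var) + resid ** 2 / (2.0 * var)
    return total


def fit_sigma2(shells, grid):
    """Return the grid value of sigma^2 that minimizes the summed -ln L."""
    return min(grid, key=lambda s2: neg_log_likelihood(s2, shells))


# Three synthetic shells constructed with sigma^2 = 0.5 exactly.
shells = [(50.0, 10.0, 1000), (200.0, 20.0, 1000), (12.5, 5.0, 1000)]
grid = [i * 0.001 for i in range(1, 2001)]
print(fit_sigma2(shells, grid))
```

Because ${\rm Var}(S_j)$ itself depends on $\sigma^2$, the minimum sits very slightly below the input value even for noiseless synthetic data; the shift shrinks as the number of cells per shell grows.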

As we discussed above the variance on the estimator, ${\rm Var}(S)$, is in fact badly
underestimated by Eq. 11 for realistic non-Gaussian density fluctuations. Although in
practice the procedure adopted by Efstathiou et al. (1990) of deriving errors in $\sigma$
from the likelihood function will not underestimate errors as dramatically as this, since the
variation between shells will contribute to the error estimate for $\sigma$, we have instead
used Monte-Carlo realizations of lognormal fields at the appropriate variance to derive more
realistic error bars. Even though these errors are model dependent, the density fluctuations
are much more closely approximated by a lognormal model than the Gaussian assumption of
Eq. 11.
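A much simplified version of such a Monte-Carlo experiment is sketched below. It assumes statistically independent cells (no spatial correlations between cells, unlike a true lognormal field), a lognormal density $1+\delta$ with unit mean and variance $\sigma^2$ in each cell, and Poisson sampling of the counts. All names and the example parameters are hypothetical; the aim is only to show how the scatter of $S$ can be measured empirically rather than taken from the Gaussian formula.

```python
import math
import random


def poisson(lam, rng):
    """Draw a Poisson deviate (Knuth's method; adequate for modest lam)."""
    limit = math.exp(-lam)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def lognormal_counts(mean_count, sigma2, n_cells, rng):
    """Poisson counts in independent cells with lognormal density."""
    s2 = math.log(1.0 + sigma2)  # variance of the underlying Gaussian
    counts = []
    for _ in range(n_cells):
        # 1 + delta has mean 1 and variance sigma2 by construction.
        density = math.exp(rng.gauss(0.0, math.sqrt(s2)) - 0.5 * s2)
        counts.append(poisson(mean_count * density, rng))
    return counts


def s_scatter(mean_count, sigma2, n_cells, n_real, seed=0):
    """Mean and variance of S over n_real mock shells."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_real):
        c = lognormal_counts(mean_count, sigma2, n_cells, rng)
        mean = sum(c) / n_cells
        values.append(sum((n - mean) ** 2 for n in c) / (n_cells - 1) - mean)
    mu = sum(values) / n_real
    var = sum((v - mu) ** 2 for v in values) / (n_real - 1)
    return mu, var


mu, var = s_scatter(mean_count=5.0, sigma2=0.5, n_cells=200, n_real=300, seed=1)
print("mean S =", mu, " Var(S) =", var)
```

Comparing the empirical `var` with the Gaussian-case expression for the same mean count, variance and number of cells gives a direct feel for how strongly the Gaussian formula underestimates the scatter.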



3.2.1 Cell variances and estimation bias 

The variances of the counts in cells calculated using the maximum likelihood approach are
shown in Fig. 2. Since we have up-weighted the counts in cells to compensate for completeness
effects we should exclude cells with zero counts from the analysis. This inevitably leads to
an underestimated variance, particularly on small scales where the empty cells become
significant ($\ell < 15\,h^{-1}\,$Mpc; grey points in Fig. 2). Below this scale we have
nevertheless calculated the variance considering empty cells as being genuinely empty (black
points in Fig. 2). The difference between these two sets of points provides a means of
estimating the magnitude of any bias introduced by excluding empty cells which will be
relevant in the following section.

The variances of the counts in cells for all $\eta$-typed galaxies seem to be consistent with
the predicted variances from the real-space correlation function, $\xi(r)$, which is an
interesting result since the calculation is carried out in redshift space, with no
corrections for redshift-space distortions. This is due to an estimation bias leading to an
underestimate of the true redshift-space variance in cells; the fact that this results in a
variance consistent with the real-space variance would appear to be a coincidence.

The estimation bias in question is a finite volume effect exactly equivalent to the integral
constraint, and is discussed in detail in Hui & Gaztañaga (1999). It becomes relevant here
because we are using radial shells which have a rather small volume because of the geometry
of the 2dFGRS slabs. Hence the individual shell contributions, $S_j$ in Eq. 14, can be
significantly biased.

Hui & Gaztañaga give a useful analytical approximation for the magnitude of the integral
constraint bias in the variance. We express the expected value of our variance estimator as
Figure 2. $\sigma$ as a function of cell size, $\ell$ ($h^{-1}\,$Mpc), as measured by the maximum likelihood technique of Efstathiou et al. (1990). Filled symbols are for the NGP region,
open symbols are SGP. The results shown are fits to the early-type galaxies (squares), late types (triangles), and to both types combined (circles). Predicted
values are overlaid, calculated using a power-law form for $\Delta^2(k)$ (Eq. 6; dotted line), from the integral over the power-law fit to $\xi(r)$ (dash-dot line), from
the integral over the power-law fit to the redshift-space correlation function, $\xi(s)$ (dashed line), and from the integral over the interpolated data table for $\xi(s)$
(solid line). (The points for the SGP are offset by a small amount for clarity.) Points in grey are the measured values when empty cells are removed from the
analysis; for $\ell > 14\,h^{-1}\,$Mpc this has no effect on the variance measurements. The error bars are derived from Monte-Carlo realizations of lognormal models
as described in the text.



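$$\langle \hat{\sigma}^2 \rangle = \sigma^2 \left( 1 + \frac{\Delta\sigma^2}{\sigma^2} \right), \qquad (16)$$

where $\Delta\sigma^2/\sigma^2$ is the fractional bias in $\sigma^2$. The fractional bias can
be approximated by the expression:

$$\frac{\Delta\sigma^2}{\sigma^2} = -\frac{\sigma_V^2}{\sigma^2} + (3 - 2c_{12})\,\sigma_V^2, \qquad (17)$$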

where $\sigma_V^2$ is the two-point correlation function averaged over the
whole volume in question, which in this case is the volume of a
shell, and $c_{12}$ is a coefficient derived from the hierarchical relation:

$$\langle \delta_i^{m} \delta_j^{m'} \rangle_c = c_{mm'}\, \bar{\xi}_2^{\,m+m'-2}\, \xi_2(i,j), \qquad (18)$$

where $\langle \delta_i^{m} \delta_j^{m'} \rangle_c$ is the connected cosmic $(m+m')$-point function,
neglecting Poisson terms, with at most two differing indices. We
have used the perturbative value for $c_{12}$ from Bernardeau (1994):

$$c_{12} = \frac{68}{21} + \frac{\gamma}{3}. \qquad (19)$$

Using this approximation we can correct the $S_j$ and Var($S_j$)
in Eq. 14 to obtain a bias-corrected estimate for $\sigma$. In Fig. 3 we
show an example of the estimated variances in shells for both the
original Efstathiou estimator (Eq. 12; grey points and horizontal
lines) and the bias-corrected version (black points and horizontal
lines), for a single scale $\ell = 14\,h^{-1}$ Mpc. The correction for integral
constraint bias shifts the maximum-likelihood variance estimator
such that our results for the variance of the counts in cells for all
$\eta$-typed galaxies are now consistent with the predicted values from
the redshift-space correlation function, as shown in Fig. 4. The bias
correction also increases the errors on the individual shell variance
measurements.
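
As an illustration of how such a correction could be applied to a single shell, the following Python sketch inverts the approximation for the fractional bias. It assumes that Eq. 17 has the form $\Delta\sigma^2/\sigma^2 = -\sigma_V^2/\sigma^2 + (3 - 2c_{12})\sigma_V^2$ and that the last term of Eq. 19 is $\gamma/3$, with $\gamma$ the logarithmic slope of the variance. The function names and all numerical values are hypothetical and are not taken from the survey analysis.

```python
def c12_perturbative(gamma):
    """Perturbative coefficient of Eq. 19, assuming c12 = 68/21 + gamma/3."""
    return 68.0 / 21.0 + gamma / 3.0


def biased_variance(sigma2_true, sigma2_vol, c12):
    """Expected value of the variance estimator (Eq. 16) for a given true variance."""
    frac_bias = -sigma2_vol / sigma2_true + (3.0 - 2.0 * c12) * sigma2_vol
    return sigma2_true * (1.0 + frac_bias)


def corrected_variance(sigma2_est, sigma2_vol, c12):
    """Solve Eq. 16 for the true variance given the measured one.

    <est> = sigma2 - sigma2_vol + (3 - 2*c12) * sigma2_vol * sigma2
    =>  sigma2 = (est + sigma2_vol) / (1 + (3 - 2*c12) * sigma2_vol)
    """
    denom = 1.0 + (3.0 - 2.0 * c12) * sigma2_vol
    if denom <= 0.0:
        raise ValueError("approximation breaks down: sigma2_vol too large")
    return (sigma2_est + sigma2_vol) / denom


# Hypothetical numbers for one shell (illustrative only).
c12 = c12_perturbative(gamma=-1.8)
measured = biased_variance(sigma2_true=0.90, sigma2_vol=0.02, c12=c12)
recovered = corrected_variance(measured, sigma2_vol=0.02, c12=c12)
```

In this toy case the measured variance is biased low, and the inversion recovers the input value. The correction depends on $\sigma_V^2$, so it grows for shells of small volume, in line with the discussion above.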



The effect of large-scale structure can clearly be seen in Fig. 3;
in particular, the noticeable spike around $r = 250\,h^{-1}$ Mpc corresponds
to the prominent group of large clusters in the NGP region
at around $z = 0.09$. This is the NGP 'hotspot' observed by Baugh
et al. (2004). Removing the shells around this value of $r$ does not
alter our results appreciably relative to the magnitude of the estimated
errors.



4 ONE-POINT DISTRIBUTION FUNCTIONS 

Given a model for the one-point density distribution function, we
can model the one-point distribution of galaxy counts as being
equivalent to a convolution of the density field with Poisson fluctuations
of intensity $\lambda = N_{\rm exp}(1 + \delta)$, i.e.

$$P(N) = \int d\delta\, \frac{[N_{\rm exp}(1+\delta)]^{N}}{N!}\, e^{-N_{\rm exp}(1+\delta)}\, f(\delta). \qquad (20)$$

We can then use Eq. 20 as the basis for a maximum-likelihood
method to determine the parameters of the best-fitting model for $f(\delta)$.
A number of models have been proposed for $f(\delta)$, including the
lognormal distribution (Coles & Jones 1991), the negative binomial
model (Fry 1986; Carruthers 1991; Gaztañaga & Yokoyama 1993;
Bouchet et al. 1993), the Edgeworth expansion around the Gaussian
distribution (Juszkiewicz et al. 1995) and the Edgeworth expansion
around the lognormal model, i.e. the skewed lognormal approximation
of Colombi (1994). Ueda & Yokoyama (1996) consider fits of
all of the above models to the counts in cells of a low-density CDM
N-body simulation. They find that the most satisfactory fit is given
by the skewed lognormal model but unfortunately it is not positive

Figure 4. The same as Fig. 2 but using the approximation of Hui & Gaztañaga to correct for the integral constraint bias.

Figure 3. An example of the estimated cell variance in shells, plotted against $r_{\rm shell}$ ($h^{-1}$ Mpc), compared
to the maximum likelihood value and its associated 1-$\sigma$ error shown in
the horizontal black solid and dashed lines. This example is for the $\ell = 14\,h^{-1}$ Mpc
cell division of all $\eta$-typed galaxies in the NGP region. The
points in grey (offset for clarity) show the same plot before the effects of
the integral constraint bias have been corrected for, and the grey horizontal
solid and dashed lines show the uncorrected maximum likelihood estimate
with its associated 1-$\sigma$ error.


definite, making it unsuitable for the maximum likelihood fitting
procedure outlined below. The lognormal model is a satisfactory fit
to the data on most scales except for the highly non-linear regime; it
is also the most mathematically convenient since the version given
below is already normalized in the interval $-1 \le \delta < \infty$ and ensures
that $\langle \delta \rangle = 0$. In later sections we will use the best-fitting $f(\delta)$
to determine the parameters for a number of models for the relative
bias, following the example of Blanton (2000).



4.1 Lognormal model fitting 

Provided that the model for $f(\delta)$ that we choose is a reasonably
accurate approximation, the actual choice of model should not affect
the conclusions we infer for the relative bias. We therefore
use exclusively the lognormal model, given in Eq. 21, where
$x = \ln(1 + \delta) + \sigma_{\rm LN}^2/2$,

$$f(\delta)\,d\delta = \frac{1}{\sqrt{2\pi}\,\sigma_{\rm LN}} \exp\left( -\frac{x^2}{2\sigma_{\rm LN}^2} \right) dx. \qquad (21)$$

It should be noted that we have here followed the notation of
Coles & Jones (1991), in that $\sigma_{\rm LN}^2$ is the variance of the Gaussian
model from which the lognormal is derived by transformation;
note that Wild et al. (in preparation) use $\omega^2$ for the same parameter.
The variance for the lognormal model is given by

$$\langle \delta^2 \rangle = \exp(\sigma_{\rm LN}^2) - 1. \qquad (22)$$

To ensure that the results of this section can be easily compared
to those measuring the variance in the previous section we have
used the above relation to transform our variances from $\sigma_{\rm LN}$ to $\sigma$.

Using this model, we define a likelihood for each cell $i$ as

$$L_i = P(N_i \,|\, \sigma_{\rm LN}),$$

in which the probability of observing $N_i$ galaxies in cell $i$ is determined
by Eq. 20. We then find $\sigma_{\rm LN}$ for the best-fitting lognormal
model by minimizing with respect to $\sigma_{\rm LN}$ the quantity

$$\mathcal{L} = -2 \sum_i \ln L_i, \qquad (23)$$

where the sum is over all cells, as defined previously in Eq. 15.
Again, this differs by a factor 2 compared to $\mathcal{L}$ defined by Wild et
al.
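
The fitting procedure can be sketched in a few lines of Python. The version below is only a minimal illustration, not the code used for the survey analysis: it evaluates the Poisson-lognormal convolution of Eq. 20 by simple trapezoidal integration over $x = \ln(1+\delta) + \sigma_{\rm LN}^2/2$, and minimizes $\mathcal{L}$ with a golden-section search that assumes a single minimum. The function names, the integration grid and the example counts are all hypothetical.

```python
import math


def log_poisson(n, lam):
    """Natural log of the Poisson probability of n counts for intensity lam > 0."""
    return n * math.log(lam) - lam - math.lgamma(n + 1)


def prob_count(n, n_exp, sigma_ln, n_grid=400, width=8.0):
    """P(N) of Eq. 20 for a lognormal f(delta), by trapezoidal integration.

    The integration variable x is Gaussian with zero mean and standard
    deviation sigma_ln, and 1 + delta = exp(x - sigma_ln**2 / 2).
    """
    step = 2.0 * width * sigma_ln / n_grid
    norm = 1.0 / (math.sqrt(2.0 * math.pi) * sigma_ln)
    total = 0.0
    for k in range(n_grid + 1):
        x = -width * sigma_ln + k * step
        lam = n_exp * math.exp(x - 0.5 * sigma_ln ** 2)  # N_exp * (1 + delta)
        gauss = norm * math.exp(-0.5 * (x / sigma_ln) ** 2)
        weight = 0.5 if k in (0, n_grid) else 1.0
        total += weight * gauss * math.exp(log_poisson(n, lam))
    return total * step


def minus_two_log_like(counts, n_exps, sigma_ln):
    """The quantity of Eq. 23: -2 * sum_i ln L_i over cells."""
    return -2.0 * sum(math.log(prob_count(n, ne, sigma_ln))
                      for n, ne in zip(counts, n_exps))


def fit_sigma_ln(counts, n_exps, lo=0.05, hi=2.5, tol=1e-3):
    """Golden-section search for the sigma_ln that minimizes Eq. 23."""
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - phi * (b - a), a + phi * (b - a)
    fc = minus_two_log_like(counts, n_exps, c)
    fd = minus_two_log_like(counts, n_exps, d)
    while b - a > tol:
        if fc < fd:
            b, d, fd = d, c, fc
            c = b - phi * (b - a)
            fc = minus_two_log_like(counts, n_exps, c)
        else:
            a, c, fc = c, d, fd
            d = a + phi * (b - a)
            fd = minus_two_log_like(counts, n_exps, d)
    return 0.5 * (a + b)


# Hypothetical counts in eight cells, each with six galaxies expected.
example_counts = [3, 9, 1, 14, 6, 2, 5, 20]
example_n_exp = [6.0] * 8
best_sigma_ln = fit_sigma_ln(example_counts, example_n_exp)
```

Working in the Gaussian variable $x$ rather than in $\delta$ keeps the integrand smooth, so a coarse uniform grid is sufficient for a sketch of this kind.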
Table 2. Best-fitting lognormal model parameters and KS-test probabilities for
the lognormal model fit. The parameters for low $\ell$ cell divisions are derived
from the data excluding empty cells and corrected for the resultant bias as
described in the text.

                        early-type galaxies      late-type galaxies       all $\eta$-typed
$\ell$ ($h^{-1}$ Mpc)   $\sigma$   $P_{\rm KS}$   $\sigma$   $P_{\rm KS}$   $\sigma$   $P_{\rm KS}$
 7.0                    1.20       6e-14          1.05       7e-55          1.12       2e-99
 8.75                   1.12       0.001          0.94       3e-18          1.03       4e-47
10.5                    1.05       0.229          0.88       0.002          0.97       5e-15
12.2                    0.99       0.268          0.81       0.134          0.90       0.001
14.0                    0.95       0.105          0.77       0.183          0.87       0.015
17.5                    0.84       0.292          0.67       0.442          0.77       0.212
21.0                    0.78       0.670          0.61       0.193          0.70       0.332
24.5                    0.68       0.291          0.53       0.710          0.61       0.414
28.0                    0.59       0.992          0.45       0.695          0.53       0.987
31.5                    0.57       0.829          0.44       0.988          0.51       0.962
35.0                    0.53       0.553          0.41       0.625          0.48       0.817
38.5                    0.48       0.195          0.39       0.308          0.44       0.965
42.0                    0.42       0.850          0.32       0.925          0.37       0.825

Figure 5. The one-point distribution function for counts in cells of all galaxies with $\eta$ type for the SGP region over a range of cell sizes, from left to right
and top to bottom, $\ell$ = 14, 17.5, 21, 24.5, 28, 31.5, 35, 38.5, 42 $h^{-1}$ Mpc. The average of a large number of realizations of the best-fitting lognormal models,
convolved with the same $N_{{\rm exp},i}$ as the data, are also shown, together with their 1-$\sigma$ spread. (Note that the y axis changes between plots in this figure.)






Table 3. Number of empty cells in full survey data compared to the 10% & 90% percentiles of the empty cells in 
lognormal models matching the measured variances in cells. The rightmost column shows the number of empty 
cells when the Hubble Volume mock catalogues are analysed in the same way. 



                        early-type galaxies         late-type galaxies          all $\eta$-typed            HV mocks
$\ell$ ($h^{-1}$ Mpc)   $N_{\rm empty}$   Models    $N_{\rm empty}$   Models    $N_{\rm empty}$   Models    $\langle N_{\rm empty} \rangle$
10.5                    ...  1005  ...
12.2                    ...               321-370   ...               339-387   333               99-128    ...
14.0                    211               89-114    163               88-113    83                19-31     17



4.2 Lognormal model variances and empty cells 

We have fitted lognormal models for the one-point density distribution
to both the early- and late-type galaxy populations, and to the
whole sample. Examples of the actual one-point distributions for a
cell size of $\ell = 17.5\,h^{-1}$ Mpc for early and late types, as well as
for the full data set (all galaxies with $\eta$ type), are shown in Fig. 6,
and one-point distributions as a function of cell size $\ell$ for the SGP
region are shown in Fig. 5. Fitting models to all cells including
empty cells results in variances, particularly on small scales, which
are many sigma above both the variance predicted from $\xi(s)$ and
the measurements of the variance in cells of Section 3. For example,
the best-fitting lognormal model for $\ell = 10.5\,h^{-1}$ Mpc gives
a variance in early types of $\sigma_E \sim 1.9$, whereas our previous measurements
give $\sigma_E \sim 1.1$. This discrepancy comes from the difficulty
that the models have in reproducing the observed number of
empty cells.

Some insight into why empty cells have such a dramatic effect
can be gained by examining the number of empty cells in the
data on scales where they become significant ($\ell < 15\,h^{-1}$ Mpc).
Table 3 shows the number of empty cells in the full survey data
for each type and for all $\eta$-typed galaxies, along with the 10th and
90th percentiles of the distribution of empty cells in a large number
of Monte Carlo realizations of lognormal models with variances
matching our previous measurements of the variance in cells. A
Poisson-sampled lognormal model with a realistic variance cannot
explain the number of empty cells in the data; the large excess of
empty cells will increase the variance of the best-fitting lognormal
model. Table 3 also shows the number of empty cells which are
found when we analyse the Hubble Volume mock catalogues, appropriately
sampled to match the survey data (Cole et al. 1998; Norberg
et al. 2002b), using the same method. The number of empty
cells in the real and mock data on small scales exceeds the number
predicted by the lognormal model, which suggests that the lognormal
model is not a good fit to the actual density distribution
function. In the real data this discrepancy is more pronounced and
extends to smaller scales. This is related to the void probability
function, as discussed in Croton et al. (2004a).
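
The kind of comparison summarized in Table 3 can be illustrated with a short Monte Carlo sketch. The code below is a hedged example rather than the procedure actually used: it assumes that each realization draws one lognormal density contrast per cell and Poisson-samples it with the expected count for that cell, and it relies on NumPy for the random draws. The function name and the inputs in the usage lines are hypothetical.

```python
import numpy as np


def empty_cell_percentiles(n_exp, sigma, n_real=1000, seed=0):
    """10th and 90th percentiles of the number of empty cells in
    Poisson-sampled lognormal realizations.

    n_exp : expected counts per cell (one entry per cell)
    sigma : rms density contrast of the lognormal model, sqrt(<delta^2>)
    """
    rng = np.random.default_rng(seed)
    n_exp = np.asarray(n_exp, dtype=float)
    # Variance of the underlying Gaussian, from <delta^2> = exp(sigma_ln^2) - 1.
    sigma_ln = np.sqrt(np.log1p(sigma ** 2))
    x = rng.normal(0.0, sigma_ln, size=(n_real, n_exp.size))
    lam = n_exp * np.exp(x - 0.5 * sigma_ln ** 2)  # N_exp * (1 + delta)
    counts = rng.poisson(lam)
    n_empty = (counts == 0).sum(axis=1)  # empty cells per realization
    return np.percentile(n_empty, [10, 90])


# Hypothetical example: 300 cells, two galaxies expected in each.
p10, p90 = empty_cell_percentiles([2.0] * 300, sigma=1.0, n_real=200)
```

An observed number of empty cells well above the 90th percentile returned by such a calculation would indicate, as in the text, that the Poisson-sampled lognormal model with the measured variance cannot account for the voids in the data.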

A simple solution to this problem is to fit lognormal models
to the counts in cells excluding empty cells. Clearly this will cause
variance estimates to be biased on scales where empty cells are
significant ($\ell < 15\,h^{-1}$ Mpc). We have estimated the magnitude of
this bias both by measuring the effect of excluding empty cells from
Monte Carlo realizations of lognormal models, and by considering
the bias introduced into the variance measurements on small scales
of Section 3 when we excluded empty cells; both methods give
similar results.

The values of $\sigma$ from the lognormal fits to the early- and late-type
subsets, as well as the full catalogue, are shown in Fig. 7. The
black points show the measurements excluding empty cells and corrected
for the bias; points in grey for small $\ell$ show the original measurements,
illustrating the bias introduced by excluding empty cells
in a similar manner to Fig. 2. Since the likelihood function is well
approximated by a $\chi^2$ distribution with one degree of freedom, we
have derived 1-$\sigma$ errors by considering the values for $\sigma_{\rm LN}$ at which
$\mathcal{L} = \mathcal{L}_{\rm min} + 1$. We have also obtained Monte Carlo estimates of the
errors by using the procedure outlined above to fit a large number
of models generated by randomly drawing the galaxy density contrast,
$\delta$, from a lognormal distribution with the best-fitting value of $\sigma$
and then generating model counts from a Poisson distribution with
intensity $\lambda = (1 + \delta) N_{\rm exp}$, using the same expected counts in cells
as were calculated for the data. The magnitudes of the errors from both
methods are identical.
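
The first of these error estimates amounts to finding where $\mathcal{L}$ rises by one above its minimum. A minimal sketch, assuming $\mathcal{L}$ can be evaluated on a grid of trial values and has a single minimum, is given below; the function name is hypothetical and the parabolic `toy_like` merely stands in for a real likelihood.

```python
def delta_like_interval(neg2_log_like, grid):
    """Best-fitting value and the range over which L <= L_min + 1.

    neg2_log_like : callable returning -2 ln(likelihood) for a trial value
    grid          : increasing list of trial values
    """
    values = [neg2_log_like(s) for s in grid]
    l_min = min(values)
    best = grid[values.index(l_min)]
    inside = [s for s, v in zip(grid, values) if v <= l_min + 1.0]
    return best, min(inside), max(inside)


def toy_like(s):
    """Stand-in for Eq. 23: a parabola with minimum at 0.80 and 1-sigma width 0.05."""
    return ((s - 0.80) / 0.05) ** 2


trial_values = [0.50 + 0.001 * k for k in range(601)]
best, lower, upper = delta_like_interval(toy_like, trial_values)
```

For the toy parabola the recovered interval is close to 0.80 +/- 0.05, as expected for a unit rise in a quantity that behaves like $\chi^2$ with one degree of freedom.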

As another measure of the goodness of fit for these models
we show in Table 2 probabilities obtained by the application of a
Kolmogorov-Smirnov (KS) test to the distribution of $N/N_{\rm exp}$. The
KS test is not ideal for a number of reasons; in particular, it is rather insensitive to
variation in the tails of distributions, which is where we would expect
the lognormal model to have the most difficulty in matching
the data. Strictly speaking, the estimate of $P_{\rm KS}$ which we use here
is no longer valid once the data have been used to fix any free parameters
of the model, although any effects should be small since the
number of data points we use is very much larger than the number
of free parameters.
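
For reference, a one-sample KS test can be sketched as follows. This is a generic illustration only: it uses a commonly quoted asymptotic approximation for the significance level, takes the model cumulative distribution as a user-supplied function, and does not reproduce how the model distribution of $N/N_{\rm exp}$ was constructed in the analysis. All names are hypothetical.

```python
import math


def ks_statistic(sample, cdf):
    """Maximum distance D between the empirical CDF of sample and the model cdf."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # Compare the model with the empirical CDF just above and just below x.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def ks_probability(d, n, n_terms=100):
    """Approximate probability of a distance >= d under the null hypothesis."""
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    if lam < 0.2:
        return 1.0  # the alternating series converges poorly here; P is ~1
    total = 0.0
    for j in range(1, n_terms + 1):
        total += (-1) ** (j - 1) * math.exp(-2.0 * (j * lam) ** 2)
    return min(1.0, max(0.0, 2.0 * total))


def uniform_cdf(x):
    """Toy model distribution: uniform on [0, 1]."""
    return min(1.0, max(0.0, x))


good_sample = [(i + 0.5) / 100 for i in range(100)]
p_good = ks_probability(ks_statistic(good_sample, uniform_cdf), len(good_sample))
```

Because $D$ is dominated by the central part of the cumulative distribution, two distributions that differ only in their extreme tails can still return a high probability, which is the insensitivity noted above.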

The KS test probabilities indicate that the lognormal model is
an acceptable fit to the data on large scales for both early and late
types, as well as for all galaxies. None of the values for $P_{\rm KS}$ for
$\ell \ge 17.5\,h^{-1}$ Mpc are sufficiently low to exclude the model at a
high level of confidence. In general the early-type distribution is
well fitted by lognormal models to smaller scales than the distribution
of late types or the combined galaxy distribution. On scales smaller
than $\ell = 10.5\,h^{-1}$ Mpc a lognormal model is not a satisfactory fit to
any of the distributions. We expect, based on the findings of Ueda
& Yokoyama (1996), that the lognormal model will not adequately
describe the data in the non-linear regime. Our results are consistent
with this expectation, since the variance is significantly higher
than unity on the scales at which the lognormal model becomes
unsatisfactory.

Figure 6. An example of the one-point distribution function for counts in
cells of size $\ell = 17.5\,h^{-1}$ Mpc in the NGP region for early-type galaxies
(top), late types (middle), and all galaxies with $\eta$ types (bottom). The average
of a large number of realizations of the best-fitting lognormal models,
convolved with the same $N_{{\rm exp},i}$ as the data, are also shown, together with
their 1-$\sigma$ spread. The higher variance of the early-type distribution is very
clear.

Although the values of $\sigma$ for all $\eta$-typed galaxies shown in Fig. 7 are broadly
consistent with the cell variance derived from $\xi(s)$ in Section lTTI,
there is a systematic trend for the fitted values of $\sigma(\ell)$ to be higher
than predicted. The magnitude of this effect is between 10 and 15 per cent,
as shown in the ratio plot of Fig. 8; this corresponds to around a
2-$\sigma$ effect. Maximum likelihood fits to lognormal models giving
variances consistent with predictions are obtained with the introduction
of an additional weighting factor to the likelihood defined
in Eq. 23:



■ In Li . 



(24) 



The results of applying the above weighting factor are shown
by the filled squares in Fig. 8. In effect this modified likelihood

Figure 7. $\sigma$ from the best-fitting lognormal model as a function of cell size, $\ell$.
Filled symbols are for the NGP region, open symbols are SGP (offset as previously).
The results shown are fits to the early-type galaxies (squares), late
types (triangles), and to both types combined (circles). Predictions are overlaid
as in Fig. 2. The results are based on counts in cells with empty cells
removed from the analysis. The small-scale results (for $\ell \le 14\,h^{-1}$ Mpc) are
corrected for the bias resulting from excluding empty cells, which causes the
variance to be underestimated, as illustrated by the grey points (see text).

Figure 8. Ratio plot of $\sigma(\ell)$ from the best-fitting lognormal model compared
to the predicted value integrating $\xi(s)$ over cells. The crosses show the unweighted
maximum likelihood results for all galaxies with $\eta$ types and NGP
and SGP regions combined. Filled squares (offset) show the results when
an additional weighting is applied to $\mathcal{L}(\sigma_{\rm LN})$, giving more weight to more
overdense cells.



gives more weight to the most dense regions, and suggests that the
lognormal model is more appropriate to describe the density distribution
of high-density regions. This would be consistent with
our observation that in general the early-type distribution is better
fitted by a lognormal model than the distribution of late types, since
early types are more prevalent in dense regions. We have tested the
weighting scheme on model data based on a lognormal model and
verified that it does not underestimate $\sigma$ in this case; the fact that a
discrepancy such as this exists indicates that the lognormal model
is not a completely satisfactory model for the one-point distribution
functions. However, the KS test probabilities show that it is
nevertheless an adequate prescription for our purposes, and indeed
including the above weighting scheme does not alter our conclusions
for the relative bias presented in the following section.

5 THE RELATIVE BIAS

We are now in a position to consider the characteristics of the joint
distribution of the counts in cells. We postulate a smoothed density
contrast field for galaxies, $\delta_g$, which can be related to the density
field of dark matter, $\delta$, using the general biasing framework of
Dekel & Lahav (1999),

$$\delta_g = b(\delta)\,\delta + \epsilon, \qquad (25)$$

which in principle is able to deal with both nonlinearity and
stochasticity. We further assume that a similar relationship holds
independently for the separate spectral types; in other words we
consider early and late types with their own separate smoothed density
fields denoted by $\delta_E$ and $\delta_L$ respectively. Then we can specify
the relative bias between the density fields analogously to Eq. 25:

$$\delta_L = b(\delta_E)\,\delta_E + \epsilon. \qquad (26)$$

We have taken two approaches to quantifying the relative bias.
Our first method considers an estimate of the galaxy density contrast
for each spectral type in each cell, $i$, which we denote as

$$g_{E,i} = N_{E,i}/N_{E,{\rm exp},i} - 1, \qquad (27)$$

for the early-type galaxies, and analogously for late types.

Under the assumption that the density fields of both spectral
types are related to the underlying dark matter field by a linear
bias factor and that the only scatter is due to the Poissonian scatter
caused by galaxy discreteness, we have for the early-type galaxies

$$g_{E,i} = b_E\,\delta_i + \epsilon_{E,i}, \qquad (28)$$

where $\epsilon_{E,i}$ is the Poisson noise for the early types in cell $i$. $g_{L,i}$ can
be defined similarly. This is the basis for the null test described in
Section 5.2.

In the second approach we attempt to fit to the joint distribution
of the underlying smoothed density fields:

$$f(\delta_E, \delta_L) = f(\delta_L|\delta_E)\, f(\delta_E). \qquad (29)$$

We follow the example of Dekel & Lahav (1999) and adopt a
general description for relative bias:

$$b(\delta_E)\,\delta_E = \langle \delta_L | \delta_E \rangle = \int d\delta_L\, f(\delta_L|\delta_E)\, \delta_L. \qquad (30)$$

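The function $b(\delta_E)$ is characterized by Dekel & Lahav (1999) by
defining the following moments:

$$\hat{b} \equiv \frac{\langle b(\delta_E)\,\delta_E^2 \rangle}{\sigma_E^2}, \qquad \tilde{b}^2 \equiv \frac{\langle b^2(\delta_E)\,\delta_E^2 \rangle}{\sigma_E^2}, \qquad (31)$$

where $\sigma_E = \sqrt{\langle \delta_E^2 \rangle}$, as we have used previously throughout this
paper.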

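A random biasing field, $\epsilon$, is defined in our case as

$$\epsilon = \delta_L - \langle \delta_L | \delta_E \rangle, \qquad (32)$$

and the average biasing scatter, $\sigma_b$, as

$$\sigma_b^2 \equiv \frac{\langle \epsilon^2 \rangle}{\sigma_E^2}. \qquad (33)$$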

These moments separate the effects of nonlinearity and
stochasticity of the bias relation. Linear bias is often described by
the ratio of variances of the density field. In fact this bias parameter,
$b_{\rm var}$, is a mixture of non-linear and stochastic effects and can
be expressed in terms of the above moments as:

$$b_{\rm var}^2 = \tilde{b}^2 + \sigma_b^2, \qquad (34)$$

where $b_{\rm var}^2 = \sigma_L^2/\sigma_E^2$.
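
To make the role of these moments concrete, the sketch below estimates them from a set of sampled density contrasts. It assumes the Dekel & Lahav (1999) forms $\hat{b} = \langle b(\delta_E)\delta_E^2 \rangle/\sigma_E^2$, $\tilde{b}^2 = \langle b^2(\delta_E)\delta_E^2 \rangle/\sigma_E^2$ and $\sigma_b^2 = \langle \epsilon^2 \rangle/\sigma_E^2$, with $b_{\rm var}^2 = \sigma_L^2/\sigma_E^2$. The function name, the toy bias relation and the noise level are hypothetical and are not fitted to the data.

```python
import math
import random


def bias_moments(delta_e, mean_delta_l, delta_l):
    """Sample estimates of (b_hat, b_tilde, sigma_b^2, b_var).

    delta_e      : early-type density contrasts
    mean_delta_l : conditional mean <delta_L | delta_E> = b(delta_E) * delta_E
    delta_l      : late-type density contrasts (conditional mean plus scatter)
    """
    n = len(delta_e)
    var_e = sum(d * d for d in delta_e) / n
    b_hat = sum(m * d for m, d in zip(mean_delta_l, delta_e)) / n / var_e
    b_tilde = math.sqrt(sum(m * m for m in mean_delta_l) / n / var_e)
    sigma_b2 = sum((l - m) ** 2 for l, m in zip(delta_l, mean_delta_l)) / n / var_e
    b_var = math.sqrt(sum(l * l for l in delta_l) / n / var_e)
    return b_hat, b_tilde, sigma_b2, b_var


# Toy example: lognormal delta_E, linear bias of 0.8, optional Gaussian scatter.
rng = random.Random(1)
sigma_ln = 0.6
delta_e = [math.exp(rng.gauss(0.0, sigma_ln) - 0.5 * sigma_ln ** 2) - 1.0
           for _ in range(20000)]
mean_l = [0.8 * d for d in delta_e]
noisy_l = [m + rng.gauss(0.0, 0.1) for m in mean_l]

deterministic = bias_moments(delta_e, mean_l, mean_l)
stochastic = bias_moments(delta_e, mean_l, noisy_l)
```

For the deterministic linear case all of $\hat{b}$, $\tilde{b}$ and $b_{\rm var}$ coincide and $\sigma_b$ vanishes; adding scatter leaves $\hat{b}$ and $\tilde{b}$ unchanged but raises $b_{\rm var}$, which is why the ratio of variances alone cannot distinguish nonlinearity from stochasticity.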

Figure 9. The relative bias estimated from the ratio $\sigma_E/\sigma_L$ from the
variance estimator of Efstathiou et al. (1990) for the NGP (filled triangles)
and SGP (open circles) regions. The relative bias predictions from the real-space
correlation functions per $\eta$ type of Madgwick et al. (2003) are shown
in grey.



The bivariate lognormal model considered by Wild et al.
(2004) explicitly includes a stochastic term in the relative bias,
which also effectively introduces a non-linear term.



5.1 Direct estimates of $b_{\rm var}$

We estimated the relative bias from the calculated cell variances
shown in Fig. 4. We have used $1/b_{\rm var}$ to facilitate comparison with
other papers, where the relative bias is generally defined as a ratio
($b_{\rm rel} = b_E/b_L$). The results are plotted in Fig. 9, which shows that
the relative bias is consistent with a constant $1/b_{\rm var} = 1.25 \pm 0.05$
for both the NGP and SGP regions and for all cell sizes. We have
used only the $\ell > 14\,h^{-1}$ Mpc results in this estimate; for the error
calculation we have used our measured error bars at each value
of $\ell$ and made the assumption that adjacent bins in Fig. 4 are perfectly
correlated. For comparison we have plotted the relative bias,
$b_{\rm rel} = \sqrt{\xi_E(r)/\xi_L(r)}$, from the real-space correlation functions
per $\eta$ type of Madgwick et al. (2003b), where we have converted
the separation, r, to t by assuming r = Rt and using Eq.Q This 
is intended mainly for illustration since, following the bias framework
of Dekel & Lahav (1999), the bias parameter formed from the
ratio of correlation functions is not, in general, equivalent to $b_{\rm var}$.
Clearly, though, they are consistent within the rather large errors
from the correlation function estimate.

We have also calculated the relative bias from the variances
of the lognormal model fits to the early- and late-type one-point
distributions as shown in Fig. 7:

$$b_{\rm var}^2 = \frac{\exp(\sigma_{{\rm LN},L}^2) - 1}{\exp(\sigma_{{\rm LN},E}^2) - 1}. \qquad (35)$$

The results are shown in Fig. 10, again compared to the results of
Madgwick et al. (2003b). The relative bias factor is again consistent
with a scale-invariant bias and we derive a value of $1/b_{\rm var} \simeq 1.28 \pm 0.05$
from the $\ell > 14\,h^{-1}$ Mpc data, again assuming correlation of
adjacent bins in Fig. 10.
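
As a rough consistency check of this estimate, the ratio can be evaluated from the values listed in Table 2. The sketch below assumes that the first $\sigma$ column of Table 2 refers to early types and the second to late types, that the tabulated values are $\sigma$ rather than $\sigma_{\rm LN}$, and that Eq. 35 is the ratio of lognormal variances for late and early types. It uses the rows with $\ell \ge 14\,h^{-1}$ Mpc; the variable and function names are hypothetical.

```python
import math

# (cell size, sigma_E, sigma_L) copied from Table 2, under the column
# assignment assumed in the lead-in.
TABLE2_LARGE_SCALES = [
    (14.0, 0.95, 0.77), (17.5, 0.84, 0.67), (21.0, 0.78, 0.61),
    (24.5, 0.68, 0.53), (28.0, 0.59, 0.45), (31.5, 0.57, 0.44),
    (35.0, 0.53, 0.41), (38.5, 0.48, 0.39), (42.0, 0.42, 0.32),
]


def sigma_ln_from_sigma(sigma):
    """Invert Eq. 22: sigma_LN^2 = ln(1 + sigma^2)."""
    return math.sqrt(math.log(1.0 + sigma ** 2))


def inverse_b_var(sigma_ln_e, sigma_ln_l):
    """1 / b_var, with b_var^2 given by the ratio of lognormal variances (Eq. 35)."""
    b_var2 = (math.exp(sigma_ln_l ** 2) - 1.0) / (math.exp(sigma_ln_e ** 2) - 1.0)
    return 1.0 / math.sqrt(b_var2)


ratios = [inverse_b_var(sigma_ln_from_sigma(s_e), sigma_ln_from_sigma(s_l))
          for _, s_e, s_l in TABLE2_LARGE_SCALES]
mean_ratio = sum(ratios) / len(ratios)
```

Under these assumptions the unweighted mean of the nine ratios is close to 1.28. This is a simple average that ignores the error bars and the NGP/SGP split, so it is indicative only.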

5.2 The Tegmark 'null-buster' test 

Tegmark (1999) describes a simple 'null-buster' test, based on a

Figure 10. The relative bias estimated from the parameters of the best-fitting
lognormal models for the early- and late-type one-point distributions, for
the NGP (filled triangles) and SGP (open circles) regions. The relative bias
predictions from the real-space correlation functions per $\eta$ type of Madgwick
et al. (2003) are shown in grey.

Figure 11. Variation of $\nu_{\rm min}$ from the TB99 test with scale. The solid and
dashed lines show the expected value and 1-$\sigma$ variation for results consistent
with a linear bias relation. The solid and open squares show the average
$\nu_{\rm min}$ and its 1-$\sigma$ scatter measured over a number of separate cell divisions
obtained by shifting the original divisions, for the NGP and SGP respectively.
The grey solid and dashed curves show respectively the average and
1-$\sigma$ spread of $\nu_{\rm min}$ for models including the effects of the selection function
variations on scales $< \ell$.



generalized $\chi^2$ statistic, to rule out the possibility that the density
fields traced by galaxies of two different spectral types can be related
by a simple deterministic linear biasing prescription. This test
has been used by Seaborne et al. (1999) to compare the PSCz and
Stromlo-APM redshift surveys.

If we first assume that the estimated galaxy density contrasts
for each spectral type, $g_E$ and $g_L$, are related to an underlying dark
matter density field by the prescription of Eq. 28, then we can construct
the difference map:

$$\Delta g \equiv g_E - f\, g_L, \qquad (36)$$

for different values of the relative bias factor $f = b_E/b_L$.

If the deterministic linear bias model is valid, then for the correct
value of $f$, the relative bias factor, $\Delta g$ will consist merely



Figure 12. The value of the relative bias factor, $b_{\rm rel}$, giving the minimum
value for $\nu$ in the TB99 test, as a function of cell size, $\ell$ ($h^{-1}$ Mpc). The solid triangles and open circles show the
average $b_{\rm rel}$ and its 1-$\sigma$ scatter measured over a number of separate cell
divisions obtained by shifting the original divisions, for the NGP and SGP
respectively. The black error bars are a more realistic estimate for the errors
based on models including the effects of selection function variations on
scales $< \ell$.



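of Poisson noise, which will have a covariance matrix given by
Tegmark & Bromley (1999) as

$$\mathbf{N} = \langle \Delta g\, \Delta g^{T} \rangle = \delta_{ij} \left[ \frac{1}{N_{E,{\rm exp},i}} + f^2 \frac{1}{N_{L,{\rm exp},i}} \right]. \qquad (37)$$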

Since we are testing the null hypothesis that $\langle \Delta g\, \Delta g^{T} \rangle = \mathbf{N}$,
we can define $\chi^2 \equiv \Delta g^{T} \mathbf{N}^{-1} \Delta g$. If the null hypothesis is correct,
the quantity $\nu \equiv (\chi^2 - N_c)/\sqrt{2N_c}$, where $N_c$ is the number
of cells, has an expectation of zero and a standard deviation of one.
We can therefore interpret $\nu$ as a measure of the significance with
which the null hypothesis is ruled out.
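
Because the noise matrix is diagonal, the simple form of the test reduces to a sum over cells. The following sketch shows one way this could be coded; it assumes the density contrast estimate $g = N/N_{\rm exp} - 1$ and the Poisson noise term $1/N_{E,{\rm exp},i} + f^2/N_{L,{\rm exp},i}$, and it scans a grid of trial values of $f$. The function names and the example counts are hypothetical.

```python
import math


def null_test_nu(n_e, n_e_exp, n_l, n_l_exp, f):
    """nu = (chi^2 - N_c) / sqrt(2 N_c) for the difference map g_E - f * g_L."""
    chi2 = 0.0
    for ne, ne_exp, nl, nl_exp in zip(n_e, n_e_exp, n_l, n_l_exp):
        g_e = ne / ne_exp - 1.0            # density contrast estimate, early types
        g_l = nl / nl_exp - 1.0            # density contrast estimate, late types
        delta_g = g_e - f * g_l
        noise = 1.0 / ne_exp + f * f / nl_exp  # diagonal element of N
        chi2 += delta_g ** 2 / noise
    n_cells = len(n_e)
    return (chi2 - n_cells) / math.sqrt(2.0 * n_cells)


def scan_relative_bias(n_e, n_e_exp, n_l, n_l_exp, f_grid):
    """Return (nu_min, f) for the trial f that minimizes nu."""
    return min((null_test_nu(n_e, n_e_exp, n_l, n_l_exp, f), f) for f in f_grid)


# Hypothetical counts in eight cells that exactly match their expectations.
flat_e, flat_l = [20.0] * 8, [30.0] * 8
nu_flat = null_test_nu(flat_e, flat_e, flat_l, flat_l, f=1.25)
nu_min, f_best = scan_relative_bias(flat_e, flat_e, flat_l, flat_l,
                                    [1.0 + 0.05 * k for k in range(11)])
```

In this artificial case the difference map vanishes, so $\chi^2 = 0$ and $\nu = -\sqrt{N_c/2}$; real data scatter about zero under the null hypothesis, and large positive values of $\nu$ count against it.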

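In the case where there is extra signal, $\mathbf{S}$, in the covariance
matrix of the difference map, so that $\langle \Delta g\, \Delta g^{T} \rangle = \mathbf{N} + \mathbf{S}$, the
generalized $\chi^2$ statistic (Tegmark 1999) is a more powerful way to
rule out the null hypothesis:

$$\nu \equiv \frac{\Delta g^{T} \mathbf{N}^{-1} \mathbf{S} \mathbf{N}^{-1} \Delta g - {\rm Tr}(\mathbf{N}^{-1}\mathbf{S})}{\left[ 2\, {\rm Tr}(\mathbf{N}^{-1}\mathbf{S}\mathbf{N}^{-1}\mathbf{S}) \right]^{1/2}}. \qquad (38)$$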
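
A compact NumPy sketch of the generalized statistic is given below. It assumes the form $\nu = [\Delta g^{T} N^{-1} S N^{-1} \Delta g - {\rm Tr}(N^{-1}S)] / [2\,{\rm Tr}(N^{-1}SN^{-1}S)]^{1/2}$ with a diagonal noise matrix; the function name and the toy signal matrix are hypothetical and do not represent the correlation function used in the analysis.

```python
import numpy as np


def null_buster_nu(delta_g, noise_diag, signal):
    """Generalized chi^2 statistic for a diagonal noise matrix N.

    delta_g    : difference map, one value per cell
    noise_diag : diagonal elements of N
    signal     : symmetric matrix S giving the assumed shape of any extra signal
    """
    delta_g = np.asarray(delta_g, dtype=float)
    n_inv = 1.0 / np.asarray(noise_diag, dtype=float)
    signal = np.asarray(signal, dtype=float)
    n_inv_s = n_inv[:, None] * signal   # N^-1 S
    weighted = n_inv * delta_g          # N^-1 delta_g
    numerator = weighted @ signal @ weighted - np.trace(n_inv_s)
    denominator = np.sqrt(2.0 * np.trace(n_inv_s @ n_inv_s))
    return numerator / denominator


# Toy example: four cells and a signal that decays with cell separation.
noise = np.array([0.10, 0.20, 0.05, 0.10])
dg = np.array([0.30, -0.10, 0.20, 0.05])
idx = np.arange(4)
toy_signal = np.exp(-np.abs(idx[:, None] - idx[None, :]))
nu_toy = null_buster_nu(dg, noise, toy_signal)
```

Setting $S = N$ recovers the simple statistic $(\chi^2 - N_c)/\sqrt{2N_c}$, and rescaling $S$ by a constant leaves $\nu$ unchanged, since the numerator and denominator scale together.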



If there are any deviations from deterministic linear bias we
would expect these to be correlated with large-scale structure. We
therefore choose the matrix $\mathbf{S}$ to be the covariance between cell
overdensities calculated using the redshift-space correlation function
calculated by Hawkins et al. (2003), i.e. the volume average of
$\xi(s_{ij})$ over cells $i$ and $j$. The value of $\nu$ depends only on the shape
of $\mathbf{S}$, not on its amplitude.

Note that these tests are only valid when fluctuations are close
to Gaussian. For this reason, when we apply this test we exclude
cells where $g_{E,i} > 1$ or $g_{L,i} > 1$. The test also uses a Gaussian
approximation for the Poisson fluctuations described by the covariance
matrix $\mathbf{N}$, so we apply an additional cut on cells where this
will be particularly inaccurate, where $N_{\rm exp} \le 10$. This value for
the cut in $N_{\rm exp}$ is a compromise between a value which renders the
error term from using the Gaussian approximation negligible and
the necessity of not removing too many cells. Even so we find that
the test is not applicable for $\ell \le 17.5\,h^{-1}$ Mpc. We have reduced
the redshift range to $0.03 < z \le 0.12$ for the purposes of this test
in order that our cut in $N_{\rm exp}$ does not introduce systematic effects.

The minimum values of $\nu$ for a range of values for the cell
size $\ell$ are shown in Fig. 11. We also plot error bars which are
derived from applying the test to cell divisions which are shifted
by up to $\ell/\sqrt{2}$ parallel to the RA-$z$ plane, relative to the original
division. Clearly the error bars resulting from such an approach
will underestimate the true errors. In general the values of $\nu_{\rm min}$ are
not consistent with a linear and deterministic relative bias; indeed
the significance of this detection increases as $\ell$ increases.

The apparent detection of stochasticity at such large scales
should be treated with some caution. The large cells contain a large
number of galaxies so the shot noise is small, but the results will be
very sensitive to any subtlety in the model and any small systematic
error in the data. A simple possibility is that the Poisson
sampling hypothesis of the model may not be exactly correct. An
example of an instrumental effect that could lead to an apparent
detection of stochasticity on large scales is an interaction between
small-scale stochasticity and the variation of the selection function
on scales smaller than the cell division. This can produce an excess
variance over the covariance matrix assumed in the Tegmark test
(Eq. 37). If the null hypothesis of the Tegmark test were true on
all scales we would expect $\nu_{\rm min}$ to be consistent with zero. However,
there are differences in the distribution of early- and late-type
galaxies on small scales, as seen in the morphology-density relation.
Also there are small-scale variations both in the angular
variations quantified by the survey mask and in the radial variations
in $n(z)$. So, it is possible that galaxies of one type may
preferentially reside in a region in a cell where the selection function
differs significantly from the cell average. After allowing for
incompleteness, the observed covariance of $\Delta g$ will be enhanced
relative to what would be expected from simple Poisson noise as
described by Eq. 37.

We have considered a simple model which includes small-scale
stochasticity, and variations in the selection function on scales
less than $\ell$, and found that it can reproduce our results from the
Tegmark test. We first generate linear bias models matching the
data by drawing $\delta_E$ from the best-fitting lognormal model and applying
a linear bias function to obtain $\delta_L$. We then generate a parent
number of galaxies in each cell by Poisson sampling the density
field with a constant sampling rate, assuming all the cells are 100%
complete and with $n(z)$ set to be a constant equal to the maximum
$n(z)$ at the mean redshift of the survey. The parent galaxies
are distributed within each cell using a modified Rayleigh-Levy
flight model (see Peebles 1980, section 62) matching the correlation
function. We have modified the original Rayleigh-Levy flight
model so that the lacunarity of the process is more realistic; in effect
the voids in our clustered point process are less empty. We then
select or reject the parent galaxies based on the selection function
at the location of each galaxy. The point processes for model early
and late types are independent.

The grey solid and dashed lines in Fig. 11 show the average
and 1-$\sigma$ spread of $\nu_{\rm min}$ when the Tegmark test is applied to our
Rayleigh-Levy flight models. It can be seen that including the effect
of selection function variation on scales less than the cell size can
reproduce the kind of results seen in the data, without the need to
invoke any non-linear or stochastic relative bias on large scales.

Nevertheless, a significant result from the Tegmark test means
that it is difficult to avoid the conclusion that nonlinearity and/or
stochasticity exists in the density field at some scale. We have, however,
shown that the detection of this effect on a given cell scale
can suffer from 'aliasing' of the effect from sub-cell scales. Our
model for this effect assumes a relatively large stochasticity on
small scales, which may be unrealistic at least for large $\ell$ cells. It is
an open question whether a more realistic model, for example one
based on the observed morphology-density relation, would give the
same effect; this is, however, beyond the scope of the current paper.

We plot $f$ from the best-fitting linear bias models in Fig. 12,
again with errors derived from cell shifts. Note that the values of $f$
from this test are not strictly comparable to the other values quoted
in this paper except in the case where a deterministic linear bias
model is an exact representation of the data, since we have of necessity
imposed a cut on $\delta_E$. However, we expect that on large scales
this approximation will be close enough for comparison to be instructive.
If we use the slightly larger error bars derived from our
Monte Carlo realizations of Rayleigh-Levy flight models, and assume
measurements of $f$ in adjacent bins are correlated, we obtain
$f = 1.28 \pm 0.03$ for the NGP and $f = 1.16 \pm 0.03$ for the SGP. It
is notable that the best-fitting linear bias factors for the NGP and SGP
regions do not seem to be consistent. Averaging the two regions we
find a value for $f$ which is consistent with our measurements of
$b_{\rm var}$ presented in the previous section. A more relevant comparison
is with the maximum likelihood measurements of the linear bias
parameter, which we present in the following section, where we
find a similar discrepancy between NGP and SGP regions which
we discuss more fully in Section F^ 

5.3 Fitting the joint counts in cells 

On the relatively large scales studied here, the results of the 
Tegmark test can be made consistent with the density fields of 
early- and late-type galaxies being related by a deterministic linear 
bias, once variations in the selection function on scales smaller than 
the cell size are considered. However, the test does not reveal any 
further details of the nature of the relative bias between galaxies 
of different spectral type. Blanton (2000) describes a more direct 
approach to measuring the relative bias and applies it to the Las 
Campanas Redshift Survey (LCRS). The basis of this approach is 
a maximum-likelihood fit to the joint counts in cells, $P(N_E, N_L)$, 
which is simply the joint probability of the density fields convolved 
with Poisson distributions. 

If we convolve Eq. 29 with the expected Poissonian scatter we 
derive the following joint probability for the counts: 

$$P(N_E, N_L) = \int d\delta_E \, \frac{\lambda_E^{N_E}}{N_E!} e^{-\lambda_E} f(\delta_E) \int d\delta_L \, \frac{\lambda_L^{N_L}}{N_L!} e^{-\lambda_L} f(\delta_L|\delta_E), \qquad (39)$$

where 

$$\lambda_E = N_{E,\mathrm{exp}} (1 + \delta_E),$$

and similarly for the late types. 

We have used Eq. 39 to define the likelihood as a function of 
$\sigma_{\mathrm{LN},E}$ and a model for the relative bias $f(\delta_L|\delta_E)$. We then found 
the maximum likelihood using a downhill simplex method (Press 
et al. 1992) and hence estimated the best-fitting relative bias. 
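The fit described above can be illustrated with a minimal Python sketch. It is not the code used for the paper: it assumes a lognormal $f(\delta_E)$ parametrized as $1+\delta_E = \exp(g - \sigma^2/2)$ with Gaussian $g$, a deterministic linear bias with $b_0 = 0$, a simple grid quadrature over $g$, and SciPy's Nelder-Mead routine as the downhill simplex. The function names `neg_log_like` and `fit_linear_bias` are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln, logsumexp


def neg_log_like(params, n_e, n_l, nexp_e, nexp_l, n_grid=300):
    """Return -ln(likelihood) of the joint counts for a linear deterministic bias."""
    sigma, b1 = params
    if sigma <= 0.0:
        return np.inf
    # ASSUMPTION: lognormal f(delta_E), 1 + delta_E = exp(g - sigma^2/2),
    # with g ~ N(0, sigma^2); integrate over g on a uniform grid.
    g = np.linspace(-6.0 * sigma, 6.0 * sigma, n_grid)
    log_w = -0.5 * (g / sigma) ** 2
    log_w -= logsumexp(log_w)  # normalised quadrature weights
    delta_e = np.exp(g - 0.5 * sigma**2) - 1.0
    # Deterministic bias: the Dirac delta collapses the integral over delta_L.
    # ASSUMPTION: floor just above -1 so that ln(lambda_L) stays finite.
    delta_l = np.maximum(b1 * delta_e, -1.0 + 1e-9)
    lam_e = nexp_e[:, None] * (1.0 + delta_e)[None, :]
    lam_l = nexp_l[:, None] * (1.0 + delta_l)[None, :]
    # Log of the two Poisson terms for every cell and every grid point.
    log_p = (n_e[:, None] * np.log(lam_e) - lam_e - gammaln(n_e + 1.0)[:, None]
             + n_l[:, None] * np.log(lam_l) - lam_l - gammaln(n_l + 1.0)[:, None])
    return -np.sum(logsumexp(log_p + log_w[None, :], axis=1))


def fit_linear_bias(n_e, n_l, nexp_e, nexp_l, start=(0.5, 1.0)):
    """Maximise the likelihood with a downhill simplex; returns (sigma, b1)."""
    res = minimize(neg_log_like, start, args=(n_e, n_l, nexp_e, nexp_l),
                   method="Nelder-Mead")
    return res.x
```

Here `nexp_e` and `nexp_l` are per-cell expected counts (strictly positive), so a selection function that varies from cell to cell enters only through these arrays.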

5.3.1 Deterministic bias models 

The conditional density distribution function, $f(\delta_L|\delta_E)$, can in 
principle describe completely the relationship between the galaxy 
density fields, including any nonlinearity or stochasticity. In this 
paper we have concentrated on bias models in which the density 
field of the late types is related to that of the early types in a 
deterministic manner, i.e. can be expressed in the form 

$$f(\delta_L|\delta_E) = \delta_D[\delta_L - b(\delta_E)\delta_E], \qquad (40)$$

where $\delta_D$ is the Dirac delta function. The possibility that the bias 
relation may exhibit additional scatter above the Poisson fluctuations 
is considered by Wild et al. (in preparation). 

Figure 13. The best-fitting linear bias parameter, $b_E/b_L = 1/b_1$, as a 
function of cell size $L$ for the NGP (filled triangles) and SGP (open circles) 
regions. The grey error bars are derived by considering the value of $b_1$ for 
which $\mathcal{L} = \mathcal{L}_{\min} + 1$. Black error bars adjacent to selected points are a 
more realistic error estimate showing the effect on the errors of variation of 
the selection function on scales less than $L$. 

Figure 14. The best-fitting power-law bias parameter, $1/b_1$, as a function 
of cell size $L$ for the NGP (filled triangles) and SGP (open circles) regions. 
The grey error bars are derived by considering the value of $b_1$ for which 
$\mathcal{L} = \mathcal{L}_{\min} + 1$. Black error bars adjacent to selected points are a more 
realistic error estimate showing the effect on the errors of variation of the 
selection function on scales less than $L$. 

The simplest form for the bias relation is 

$$b(\delta_E)\delta_E = b_0 + b_1 \delta_E, \qquad (41)$$

corresponding to linear bias. Of course, this model gives unphysical 
values for $\delta_L$ when $b_1 > 1$ and $-1 \le \delta_E < 0$; in this case we set 
$b(\delta_E) = 0$, following the example of Blanton (2000). 

A simple generalization of the bias relation which includes 
non-linear effects is a power-law bias model, 

$$b(\delta_E)\delta_E = b_0 (1 + \delta_E)^{b_1} - 1. \qquad (42)$$

For both these models there is only one free parameter ($b_1$), since 
$b_0$ is set by the requirement $\langle\delta_L\rangle = 0$. 
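As a worked illustration of how the requirement of zero mean late-type overdensity fixes $b_0$, the following derivation assumes (this parametrization is our assumption, not quoted from the text) that the early-type field is lognormal with $1+\delta_E = \exp(g - \sigma^2/2)$ and $g$ Gaussian with variance $\sigma^2$, and ignores the truncation applied to the linear model at negative densities.

```latex
% ASSUMPTION: 1+\delta_E = e^{g-\sigma^2/2}, g \sim N(0,\sigma^2), so \langle\delta_E\rangle = 0.
\begin{align}
\text{linear (Eq. 41):}\quad
  & \langle\delta_L\rangle = b_0 + b_1\langle\delta_E\rangle = b_0
    \;\Rightarrow\; b_0 = 0, \\
\text{power law (Eq. 42):}\quad
  & \langle\delta_L\rangle = b_0\,\langle(1+\delta_E)^{b_1}\rangle - 1 = 0
    \;\Rightarrow\; b_0 = \langle(1+\delta_E)^{b_1}\rangle^{-1}, \\
  & \langle(1+\delta_E)^{b_1}\rangle
    = \left\langle e^{\,b_1 g - b_1\sigma^2/2}\right\rangle
    = \exp\!\left[\tfrac{1}{2}\,b_1(b_1-1)\,\sigma^2\right], \\
  & \Rightarrow\; b_0 = \exp\!\left[-\tfrac{1}{2}\,b_1(b_1-1)\,\sigma^2\right].
\end{align}
```

Under this assumption $b_0 > 1$ whenever $0 < b_1 < 1$, and $b_0 = 1$ for $b_1 = 1$, where the power-law model reduces to $\delta_L = \delta_E$.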



We have fitted deterministic bias models to the counts in cells 
only for $L \ge 14\,h^{-1}$ Mpc, since below this scale the problem of 
empty cells becomes significant. The parameters for the best-fitting 
linear bias model over a range of cell sizes are shown in Fig. 13 
(where we have used $1/b_1$ since this corresponds to what is 
normally understood as the relative bias, namely $b_{\mathrm{rel}} = b_E/b_L$, where 
$b_E$, $b_L$ correspond to the linear bias factors for early and late types). 

There is a systematic trend for the NGP early types to be more 
strongly biased than in the SGP. Such a discrepancy, also seen in the 
results from the modified Tegmark test, was not observed in the variance 
measurements, so it is important to consider whether the effect is 
indeed as significant as it would appear. The error bars given in these 
plots are from the likelihood as a function of $b_1$ for the model, and 
do not include potential correlation between $b_1$ and $\sigma_E$, which could 
lead to them being underestimated. Examination of the two-dimensional 
likelihood contours reveals that in fact the bias parameter and 
$\sigma_E$ are not significantly correlated, so neglecting $\sigma_E$ will not lead 
to an underestimation of the errors in $b_1$. We have also calculated 
errors by fitting Monte Carlo realizations of the bias models including 
our measured errors in $\sigma_E$. These are identical in magnitude to 
the errors obtained from the likelihood function, again suggesting 
that any correlation between $b_1$ and $\sigma_E$ is not significant. 

If we repeat our Monte Carlo error analysis using the 
Rayleigh-Levy flight models with a selection function that varies 
on scales less than $L$, as in Section 5.2, the actual errors become 
rather larger. If we assume that measurements of $b_1$ in adjacent bins 
of $L$ are correlated we obtain $1/b_{1,\mathrm{lin}} = 1.27 \pm 0.04$ for the NGP and 
$1/b_{1,\mathrm{lin}} = 1.17 \pm 0.04$ for the SGP, which corresponds to just less 
than a 2-$\sigma$ discrepancy. 
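The size of this discrepancy can be checked with a few lines of Python. The sketch assumes the NGP and SGP errors are independent and Gaussian, so that they add in quadrature; the helper name `discrepancy_sigma` is hypothetical.

```python
import math


def discrepancy_sigma(a, err_a, b, err_b):
    """Difference between two measurements in units of the combined error.

    ASSUMPTION: independent Gaussian errors, combined in quadrature.
    """
    return abs(a - b) / math.hypot(err_a, err_b)


# Maximum likelihood linear bias values quoted for the NGP and SGP.
n_sigma = discrepancy_sigma(1.27, 0.04, 1.17, 0.04)
print(f"{n_sigma:.2f} sigma")  # about 1.8, i.e. just under 2 sigma
```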

The variation of the best-fitting power-law bias parameter 
(again using $1/b_1$ for consistency) with cell size $L$ is shown in 
Fig. 14. If we again assume that measurements of $b_1$ in adjacent 
bins of $L$ are correlated, we obtain $1/b_{1,\mathrm{PL}} = 1.36 \pm 0.05$ for the 
NGP and $1/b_{1,\mathrm{PL}} = 1.29 \pm 0.04$ for the SGP. These results are 
noticeably higher than the linear bias parameters, showing that the 
assumption of linear bias pushes estimates of the bias parameter 
closer to unity to compensate for nonlinearities in the data. Once 
we account properly for non-linear biasing the bias parameters 
approach consistency between regions. 

Examples of the joint counts in cells for $L = 21\,h^{-1}$ Mpc 
compared to the best-fitting linear and power-law bias models are shown 
in Fig. 15, and illustrations of the power-law bias fits at a range of 
scales are shown in Fig. 16. The points show the actual counts in 
cells. The colour scale and contour levels show the expected 
distribution of cell counts, which we have generated using a large 
number of Monte Carlo realizations of the bias model using the same 
expected counts as the data. Poisson effects are responsible for the 
uneven contours on smaller scales; these effects can also be seen in 
the data. 
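A Monte Carlo realization of this kind might be generated as in the sketch below. It is illustrative only: it assumes a lognormal early-type field with $1+\delta_E = \exp(g - \sigma^2/2)$, the power-law bias of Eq. 42 with $b_0$ fixed analytically under that assumption, and independent Poisson sampling in each cell. The function name `mock_joint_counts` is hypothetical.

```python
import numpy as np


def mock_joint_counts(nexp_e, nexp_l, sigma_ln, b1, rng):
    """Draw one realization of (N_E, N_L) for a power-law bias model.

    nexp_e, nexp_l : arrays of expected counts per cell (same shape).
    sigma_ln       : ASSUMED lognormal width of the early-type field.
    b1             : power-law bias index of Eq. 42.
    """
    g = rng.normal(0.0, sigma_ln, size=np.shape(nexp_e))
    delta_e = np.exp(g - 0.5 * sigma_ln**2) - 1.0
    # b0 chosen so that <delta_L> = 0 for the assumed lognormal field.
    b0 = np.exp(-0.5 * b1 * (b1 - 1.0) * sigma_ln**2)
    delta_l = b0 * (1.0 + delta_e) ** b1 - 1.0
    # Poisson sampling of each population about its own density field.
    n_e = rng.poisson(nexp_e * (1.0 + delta_e))
    n_l = rng.poisson(nexp_l * (1.0 + delta_l))
    return n_e, n_l
```

Repeating the draw many times with the expected counts of the real cells and histogramming the resulting pairs gives a smooth estimate of the model distribution against which the observed counts can be compared.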

To test if the models are acceptable fits to the data we have 
applied a KS test to the 1-d distributions of the late-type galaxies for 
these bias models, i.e. to the projection on the late-type axis of the 
two-dimensional distributions shown in Figs 15 & 16. The values of 
$P_{\mathrm{KS}}$ for the linear and power-law bias models are shown in the final 
two columns of Table 4; linear bias is excluded for $L < 28\,h^{-1}$ Mpc 
cells whereas a power-law bias model is not ruled out for any of the 
scales considered. Fig. 17 shows the relative likelihood 
$\mathcal{L}_{\mathrm{lin}} - \mathcal{L}_{\mathrm{PL}}$ of the two models; the power-law bias model is clearly 
a better fit on smaller scales, although the difference between the 
goodness of fit of the models decreases with scale, as one would 
expect from the theoretical prejudice that linear bias should be a good 
approximation on large scales. 

Figure 15. An example of the joint counts in cells for the SGP region with $L = 21\,h^{-1}$ Mpc. The colour scale and contour levels are derived from Monte Carlo 
realizations of the best-fitting linear bias model (left) and the best-fitting power-law bias model (right), using the same expected counts as the data, and the dashed lines 
indicate the 50%, 70%, 85% and 93% significance levels. The black solid line indicates a mean relative bias of 1 and the red line shows the mean relative bias 
for the model. 
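The KS comparison of the late-type projection can be sketched as follows. This is an illustration rather than the procedure used for Table 4: it assumes the model distribution is represented by a large Monte Carlo sample of late-type counts, and it uses SciPy's two-sample test `ks_2samp`. Because the counts are integers, ties make the resulting probability approximate. The wrapper name `ks_late_type` is hypothetical.

```python
import numpy as np
from scipy.stats import ks_2samp


def ks_late_type(n_l_obs, n_l_model):
    """Two-sample KS test between observed and model late-type counts.

    n_l_obs   : observed late-type counts in cells.
    n_l_model : ASSUMED to be a large Monte Carlo sample drawn from the
                bias model with the same expected counts as the data.
    Returns (KS statistic, approximate p-value).
    """
    result = ks_2samp(np.asarray(n_l_obs), np.asarray(n_l_model))
    return result.statistic, result.pvalue
```

A small returned probability indicates that the projected model distribution is a poor description of the observed late-type counts.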



Table 4. Average biasing parameters for the best-fitting power-law bias models assuming $b_{\mathrm{var}}$ from the individual lognormal model 
fits to early and late types. $b_{1,\mathrm{lin}}$ and $b_{1,\mathrm{PL}}$ are the maximum likelihood results of Section 5.3.1 using realistic error bars, and $\hat{b}$, $\tilde{b}$ and 
$\sigma_b$ are calculated from Eqs. 31 & 33. The KS test probabilities for the linear and power-law bias models are also shown. 

$L$ ($h^{-1}$ Mpc)   $b_{1,\mathrm{lin}}$   $b_{1,\mathrm{PL}}$   $\hat{b}$      $\tilde{b}$    $b_{\mathrm{var}}$    $\sigma_b$   $P_{\mathrm{KS}}$ (linear bias)   $P_{\mathrm{KS}}$ (PL bias) 

14                   0.83±0.02     0.76±0.02    0.68±0.06    0.7±0.1      0.73±0.02    0.2±0.3      1.3e-5                   0.269 
21                   0.82±0.03     0.75±0.03    0.69±0.09    0.7±0.1      0.73±0.04    0.2±0.5      0.004                    0.412 
28                   0.79±0.08     0.75±0.08    0.7±0.2      0.7±0.3      0.74±0.06    0.1±2        0.688                    0.712 
35                   0.8±0.1       0.7±0.1      0.7±0.2      0.7±0.4      0.75±0.07    0.2±2        0.320                    0.760 
42                   0.7±0.2       0.7±0.2      0.7±0.5      0.7±0.9      0.7±0.1      0.2±3        0.723                    0.957 


5.3.2 Non-linear & stochastic bias 

In the case of linear and deterministic bias, all three bias parameters 
described at the start of the section ($\hat{b}$, $\tilde{b}$ and $b_{\mathrm{var}}$) are equal to the 
parameter $b_1$ in our model (Eq. 41). We tabulate the values of $\hat{b}$ and 
$\tilde{b}$ for our best-fitting power-law bias model in Table 4. In the absence of 
stochasticity we would have $b_{\mathrm{var}} = \tilde{b}$, but since we already have an 
estimated value for $b_{\mathrm{var}}$ from the independent fitting of lognormal 
models to the early- and late-type counts in cells, we can instead 
ask what value the stochastic bias parameter, $\sigma_b$ (Eq. 33), should 
take under the rather strong assumption that a power-law bias 
completely describes any nonlinearity. The average biasing parameters 
under this assumption are summarized in Table 4; $\sigma_b$ is generally 
$\approx 0.2$, although the large errors on $b_{\mathrm{var}}$ mean that we cannot claim 
to require excess stochasticity above Poisson noise within our 
errors. A detailed model of stochastic relative bias is discussed by 
Wild et al. (in preparation); our results are consistent with the more 
accurate measurements presented in that paper. Similarly, the 
nonlinearity quantified by $\tilde{b}/\hat{b}$ from our measurements is entirely 
consistent with that measured by Wild et al. Szapudi & Pan (2003) 
describe an interesting technique, based on the cumulative 
distribution functions, which can in principle recover the full non-linear 
bias function. This would enable a model-independent measurement 
of nonlinearity and stochasticity, although in unmodified form 
the technique is not applicable to a flux-limited sample. 
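The relation between the three bias parameters used in this section can be illustrated numerically. Eqs. 31 & 33 are not reproduced here, so the sketch assumes the standard moment definitions $\hat{b} = \langle\delta_L\delta_E\rangle/\langle\delta_E^2\rangle$ and $\tilde{b}^2 = \langle\delta_L^2\rangle/\langle\delta_E^2\rangle$ for the deterministic part, together with $b_{\mathrm{var}}^2 = \tilde{b}^2 + \sigma_b^2$. It also assumes a lognormal early-type field. The function names are hypothetical.

```python
import numpy as np


def biasing_moments(b1, sigma_ln, n=200_000, seed=0):
    """Monte Carlo estimate of (b_hat, b_tilde) for a power-law bias.

    ASSUMPTIONS: lognormal delta_E with 1 + delta_E = exp(g - sigma^2/2);
    b_hat = <dL dE>/<dE^2>, b_tilde^2 = <dL^2>/<dE^2>.
    """
    rng = np.random.default_rng(seed)
    g = rng.normal(0.0, sigma_ln, n)
    d_e = np.exp(g - 0.5 * sigma_ln**2) - 1.0
    b0 = np.exp(-0.5 * b1 * (b1 - 1.0) * sigma_ln**2)  # enforces <delta_L> = 0
    d_l = b0 * (1.0 + d_e) ** b1 - 1.0
    var_e = np.mean(d_e**2)
    b_hat = np.mean(d_l * d_e) / var_e
    b_tilde = np.sqrt(np.mean(d_l**2) / var_e)
    return b_hat, b_tilde


def sigma_b(b_var, b_tilde):
    """Stochasticity implied by b_var^2 = b_tilde^2 + sigma_b^2 (floored at 0)."""
    return float(np.sqrt(max(b_var**2 - b_tilde**2, 0.0)))


# First row of Table 4: b_var = 0.73, b_tilde = 0.7 gives sigma_b of about 0.2.
print(round(sigma_b(0.73, 0.7), 2))
```

With these definitions $\tilde{b} \ge \hat{b}$ always holds, with equality only for a linear relation, which is why the ratio $\tilde{b}/\hat{b}$ serves as a measure of nonlinearity.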



6 DISCUSSION 

In this paper we have presented a number of measurements of the 
relative bias between early- and late-type galaxies in the 2dFGRS, 
derived using the counts in cells and the joint counts in cells for the 
separate galaxy populations. Both the behaviour of the individual 
estimators of the linear relative bias parameter as a function of scale, 
and the relationship between the different estimators, have important 
implications for the scale dependence, nonlinearity and stochasticity 
of the relative bias between the early- and late-type density fields. 
We have also used a power-law bias model as the simplest model 
including non-linear effects and demonstrated the characteristics of 
the best-fitting power-law bias model as a function of scale. 

6.1 Variances and the one-point distribution function 

We have presented the variance of the counts in cells using the 
maximum-likelihood technique of Efstathiou et al. (1990), which 
we have shown is subject to a significant bias when dividing the 
data into redshift shells of low volume. We have shown that the 
method can be corrected for this integral constraint bias using the 
approximation of Hui & Gaztanaga (1999). 

The one-point distribution of the counts in cells for early- and 
late-type galaxies, and the distribution for all $\eta$-typed galaxies, has 
been fitted by lognormal models, using a maximum-likelihood 
technique. The variances found using this technique are significantly 
biased on small scales when empty cells are included in the analysis, 
and we have been able to measure reliable variances only by 
fitting to counts in cells with empty cells removed. We have 
corrected our results on small scales to compensate for the inevitable 
bias resulting from the removal of empty cells. We find that the 
lognormal model is in general an adequate fit to the distribution 
functions, as measured by a Kolmogorov-Smirnov test. However, 
the values for the variance implied by the best-fitting model 
parameters are slightly high in comparison with both predictions from the 
correlation functions and the direct counts-in-cells variance 
measurements presented in this paper. The fact that this bias 
can be corrected by introducing a weighting scheme giving more 
weight to regions of higher density contrast suggests that the 
lognormal model is a relatively crude approximation to the true 
distribution. It is likely that a generalized lognormal model, such as 
the 'skewed' lognormal model (SLNDFk) (Colombi 1994; Ueda & 
Yokoyama 1996), would be a better approximation. Unfortunately, 
the SLNDFk cannot be used in our maximum likelihood approach 
since it is not positive definite, and therefore is not strictly speaking 
a distribution function. 

Table 5. Average bias parameters over all scales from $L > 14\,h^{-1}$ Mpc 
for all of the measurements presented in the paper. Error bars are derived 
assuming measurements for adjacent bins in $L$ are correlated. 

Bias measurement                                        NGP          SGP 

$1/b_{\mathrm{var}}$ (from Efstathiou $\sigma(L)$)                 1.24±0.06    1.26±0.04 
$1/b_{\mathrm{var}}$ (from $\sigma_{\mathrm{LN}}$ fits)                     1.28±0.05    1.27±0.04 
$1/b_{1,\mathrm{lin}}$ (maximum likelihood)                      1.27±0.04    1.17±0.04 
$f$ ($1/b_{1,\mathrm{lin}}$, Tegmark test)                       1.28±0.03    1.16±0.03 
$1/b_{1,\mathrm{PL}}$                                            1.36±0.05    1.29±0.04 

Figure 17. The relative likelihoods of the two deterministic bias models 
when fitted to the joint counts-in-cells distribution. A value of 0 implies that 
both models are equally good fits to the data; positive values indicate that 
the power-law model is a better fit. Results are shown for the NGP (black 
squares) and SGP (open circles). The dashed line shows the limit in $L$ 
beyond which a linear bias model is not ruled out by the KS tests. The results, 
and the error bars shown, are obtained from fits to cell divisions shifted from 
the original cell division in the same manner as was used for the Tegmark tests 
in Section 5.2. 

6.2 Comparison of relative bias parameters 

We present in Table 5 a comparison of the relative bias parameters 
from all of the measurements presented in the paper. We have 
averaged the bias measurements for all scales with $L > 14\,h^{-1}$ Mpc, 
which we expect to be unaffected by biases from empty cells. The 
error bars on each average are obtained assuming that the 
measurements in adjacent bins in $L$ are perfectly correlated, which is a better 
approximation than assuming the measurements on separate scales 
are independent. Where relevant we have also used the more 
realistic error bars obtained from our Rayleigh-Levy flight models. 

As previously noted, the results for $1/b_{\mathrm{var}}$ are consistent 
between regions and also consistent between measurements from 
direct variance estimation and fitting lognormal models to the 
one-point distribution. 

Comparing the two estimates of the linear relative bias 
parameter $1/b_{1,\mathrm{lin}}$, from the maximum likelihood method and from the 
Tegmark test, we find in both cases a significant discrepancy 
between NGP and SGP regions. The magnitude of this discrepancy is 
around 2-$\sigma$. On the other hand the power-law bias measurements 
are approximately consistent between regions at a value of $b_{1,\mathrm{PL}}$ 
which is further from unity. As we noted in Section 5.3.1, the 
assumption of linear bias when fitting to joint counts in cells which 
contain a significant degree of nonlinearity pushes the best-fitting 
relative bias closer to unity. This effect was also noted by Wild et al. 
(in preparation). It is likely that the apparent discrepancy between 
NGP and SGP linear bias parameters is also partly an artefact 
produced when non-linear joint distributions are fitted with a linear bias 
model. 



6.3 Scale dependence of the relative bias 

In general, the relative bias is expected to be scale dependent on 
small scales ($r < r_0$). The scale at which the bias relation becomes 
scale independent depends on the scales over which the biasing 
mechanism(s) operates. Non-local bias models (Bower et al. 1993; 
Matsubara 1999) are those in which the physical processes acting 
to produce the bias act on scales larger than those defined by the 
movement of massive particles, for example those models where 
radiation from QSOs has a significant effect. Local bias models 
(e.g. Narayanan, Berlind & Weinberg 2000) are those which are 
defined by some property of the local matter field, for example its 
density. 

Narayanan, Berlind & Weinberg (2000) determine the 
variation with scale of a number of local and non-local bias models 
applied to N-body simulations. A general conclusion of this work 
is that local bias models are generically unable to influence the 
biasing relation on scales greater than $r = 8\,h^{-1}$ Mpc, which 
corresponds to $L \approx 12\,h^{-1}$ Mpc in this work. Although there does appear 
to be some variation of the best-fitting linear bias parameter on scales 
$L > 15\,h^{-1}$ Mpc, when we factor in the larger error bars derived from 
models including selection function variation across cells in a more 
realistic way, the significance of any variation becomes negligible. 
Even if the variation in the linear bias parameter were significant 
it would not necessarily imply scale dependence of the bias, since 
there is no significant scale dependence of the best-fitting power-law 
bias parameter. This illustrates the interdependence of non-linear, 
non-local and stochastic biasing effects. We conclude that any 
non-local contribution to the relative bias cannot be a dominant effect 
on large scales. 

A special case of local relative bias which was considered in 
detail by Narayanan, Berlind & Weinberg (2000) is a local 
morphology-density relation, of the type measured in the local 
environment of clusters and groups by e.g. Postman & Geller (1984). 
A general conclusion for the relative bias produced by such a lo- 
cal effect is that the constant bias factor to which the scale depen- 
dent bias asymptotes on large scales is not equal to unity; assigning 
galaxy types based on local density produces a difference in cluster- 
ing strength of the different galaxy types on all scales. Our results 
are fully consistent with this picture, which leads on to the question 
of whether the relatively well studied morphology-density relation 
can be held solely responsible for the relative bias measured in the 
2dFGRS. 
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